Foreword


The seven papers in this volume were originally presented at
the second of two summer conferences on “Mathematical Applications
in Political Science” held at Southern Methodist University
July 19-29, 1964, and July 18-August 7, 1965. To further the
interdisciplinary design of the second conference, four papers were
solicited from political scientists and three from scholars in other
disciplines. Contributors from political science are Hayward R.
Alker, Jr. and Richard L. Merritt of Yale University and Gerald
Kramer and William H. Riker of the University of Rochester. Other
contributors are Otto A. Davis (economist) and Melvin Hinich
(statistician), Carnegie Institute of Technology; Carl F. Kossack
(computer statistician), formerly of the Graduate Research Center
of the Southwest and now of the University of Georgia; and Frank
S. Scalora (mathematician), IBM World Trade Corporation.

The conferences, sponsored by the National Science
Foundation, were conceived to assist political scientists in
learning how mathematical applications may be effectively utilized
in their discipline. The meetings were designed to afford opportunities
for the presentation of techniques and models involving
statistical and mathematical applications and for high-level discussions
devoted to determination of the limits and validity of
these relatively advanced concepts as utilized in political science.
Problems


Introductory Note 

It would be surprising if the use of mathematics in any new field 
were spectacularly successful and encompassing from the outset. 

—Oskar Morgenstern 

In its inception, a few devotees of the new science of politics 
appear to have assumed that the magic of numbers, like Athene, 
springing full grown from the head of Zeus, would solve all 
problems of measurement, causation and correlative relationship. 
(These persons might have observed that in economics and psychology
subjective hypothesizing is still in order, although the
subjective element has been reduced and precision enhanced 
through the use of mathematics.) The leap of faith, substituting 
mathematics and mechanistic models for older dogmas as the 
objects of faith, implies a misunderstanding of the significance of 
their enterprise. Mathematical models, properly employed, offer 
the advantages of precision in definition, identification, and communication.
They are not the be-all and the end-all of scientific
inquiry. Subjective human agency is still relevant and essential in 
conceiving and formulating, identifying and analyzing, but in certain
important aspects of the process this agency and its attendant
biases may be reduced or removed. 

Given the exorbitant expectations of a few bemused devotees, 
groping for the essentials of the new science, it is not surprising 
that other scholars, initially dubious, accepted the contrast between 
optimistic expectations and subsequent paltry achievements as 
conclusive evidence that the entire operation was a hopeless and 
permanent failure. Mathematical applications to politics (the
horseless carriage of the social sciences) are merely a fad (and will
never replace Old Dobbin), they declared. In fairness it must be
conceded that some of the studies (certainly not all, or most) seem 
to justify harsh conclusions: superficial ecological correlations, 
addition of the nonadditive, models requiring unobtainable data, 
or other equally slipshod procedures were exhibited. 

It is easy to write off and consign to oblivion a new system, if
one bases his conclusion on the obsolescent data of the earliest,
and often halting and confused, period of development. This
fallacy of premature rejection is readily detectable when hindsight
is applied to most human endeavors. Imagine what a reviewer
might say, for instance, in freshly applying today’s sophisticated
philosophical and methodological standards to the works of
Kepler or Bodin. Yet these early scholars, with all of their limits
and defects, were obviously important precursors of future contributions
to the understanding of the cosmos and of man.

The French military command in the thirties was correct, on 
the basis of the available data of 1917 and 1918, in dismissing the 
strategic value of aircraft and in doubting the ability of the tank 
to pierce strong, static fortifications. Their monumental miscalculations
were rooted in a reliance on obsolescent data, in a failure
to anticipate technical improvements in the design of aircraft and
tanks, and in their neglecting to keep pace with the effective integration
of these weapons into offensive military systems. Is it possible
that some critics of the new political science are prone to rely
on obsolescent data and hyperbolic claims, drawn from the earliest 
stage and the least worthy disciples of the school? 

Andrew Hacker, in calling for the abandonment of “the hope that 
political analysis can be either objective or scientific,” may be 
correct, if “objectivity” and “science” are defined in the narrowest 
sense. Hacker’s fallacy, in calling for a return to “subjectivity” 
(Does he mean to glorify the concept and rest contented with the 
shopworn status quo ante?) may be due to a failure to perceive 
how scientific processes evolve. Let us suggest an alternative to 
Hacker’s policy, more realistic in the light of scholarly history and 
current developments: 1 

Let each student of politics follow the bent of his own tastes. 
Some will wish to remain subjective, including those who despair 
of being otherwise, or even those who prefer to be as subjective 
as possible. Others with the training and inclination will wish to 
join the quest for means of limiting subjectivity in the study of 
politics, rather than exulting in it. William H. Riker, observing
the progress of economics as an empirical science one hundred
and twenty years after the birth of Alfred Marshall, suggests that
the example of economics is relevant. “[It] is somewhat premature
to forego the scientific enterprise [in studying politics].”

1 See Andrew Hacker, “Mathematics and Political Science,” in James C. Charlesworth
(ed.), Mathematics and the Social Sciences (Philadelphia: The American Academy
of Political and Social Science, 1963), pp. 58-76.

A thoughtful and piercing review of Hacker’s article has been written by Arthur
S. Goldberg. See American Political Science Review, Vol. LVIII, 3 (September, 1964),
pp. 684-685.

The article by Hayward R. Alker, Jr. in this volume may well 
be an example of the kind of scholarship which enables a youthful 
scientific school to rise above the level of the fumbling and inchoate.
The article reveals its author’s skill in mathematics and
statistics, as well as in political science, and he faces squarely the 
difficult questions which arise from both directions. Moreover, he 
brings to the study a refreshing awareness of relevant literature in 
several sister social sciences. It can scarcely be charged that the 
paper addresses itself to the trivial. The centrality of the question 
of causation in empirical social science research is obvious. 

No doubt more will need to be said about the techniques of
discovering and analyzing causal inference, but this article, in
analyzing hierarchical and reciprocal concepts of causation, has
achieved a maturity of temper and a sureness in handling intricacies
which deserve emulation.

Content analysis is a technique for developing systematic information
about a body of raw data (a newspaper, for example) in
order to derive useful inferences about the values and perceptions 
of those who produced the raw data or those who were influenced 
by it. In a very loose sense, of course, historians have been engaged
in content analysis since the first document was examined
and the first inference was drawn. The term “content analysis,” 
therefore, must be defined more precisely as a disciplined and 
quantitative study of contextual frequencies and associations, 
sometimes coded along attitudinal dimensions (such as “good-bad,”
“strong-weak,” etc.). As developed by Lasswell, Pool, Stone,
Merritt, and others, the systematized study and analysis of content 
has become scientifically precise by contrast with the largely 
intuitive and impressionistic procedures of the traditional historian. 

Yet the problems of inference remain severe. Quantitative data 
drawn from the editorials of the New York Times, the Frankfurter
Allgemeine Zeitung, or Pravda, for instance, are not self-evident 
indications of the values and perceptions of the publishers, the 
editors, or editorial writers, nor of the readership. Nor can we 
assume automatically that our analysis tells us what values the 
writers aimed to communicate to the readership. Editors do not 
customarily derive their editorials from quantitative models. Even 
if they employed the same model as the analyzers, there is no 
assurance that their interpretation of findings would be the same
as that of the analyzers. 

Despite problems of this nature, content analysis is a very 
necessary enterprise and one promising valuable returns. It is exceptionally
important for informing policy-makers under modern
conditions of international politics. To be sure, it would appear
dispensable, if Dr. George Gallup and the Michigan Survey Research
Center were accorded free access to opinion leaders in the
People’s Republic of China or the U.S.S.R., or if the psychoanalyst
who serves the editor of Pravda were in the pay of the C.I.A. This 
kind of information might appear most reliable, but we should 
want to check it through the use of other indicators, including 
those produced by content analytic techniques. 

Since our survey researchers and psychoanalysts do not operate 
freely behind the Iron Curtain, and since the measurement and 
analysis of attitudes, perceptions, and values are delicate problems 
under the best of circumstances, content analysis remains a vital 
key to understanding opinion leaders and publics—past, present, 
and future. Its considerable importance, therefore, makes it essential
that the enterprise be subjected to tough-minded scrutiny.
This is the service performed by Richard Merritt in the
article which follows. Merritt appraises his own field critically, 
and he offers thoughtful suggestions to guide further study. 



Causal Inference and 
Political Analysis* 

HAYWARD R. ALKER, JR. 

Yale University 

If we can define the causal relation, we can define influence, power, 
or authority, and vice versa. 

—Herbert Simon 1 

If power is the ability more or less coercively to get people
to do things that they otherwise would not do, exercising power
is a special case of causation. Authoritative decision-makers legitimately
cause the well-being or deference accorded to some members
of a society to increase and the value positions of others to
diminish. Thus political analysis, as we usually define it, may be
thought of as the study of the processes and outcomes of authoritative
and coercive social causation. 2 The causal agents range from
individual citizens to national governments and international organizations;
the arenas of their interaction include local communities,
states, nations, and international societies.

Despite the centrality of causal inferences in political analysis, 
there has been a noticeable reluctance among political scientists 
explicitly to use causal language. Scholars would rather study
“influence,” or “power,” or “decision-making,” or “functional relationships,”
or “communication systems” than causal relationships
per se, even though each of these concepts implies some kind of 
causal dependence of policy outcomes on decision-makers placed 
in varying sociocultural and political contexts. 

A number of reasons may be offered for this reluctance. In
academic discussions philosophical objections are frequently mentioned.
Hume was the first but by no means the last skeptical
philosopher to note that we observe repeated associations rather
than causal relationships. The meaning of causality to many such
skeptics remains unclear. Operational methods for establishing
causal relationships seem to be largely unknown.

* In the preparation of this paper I was greatly aided by the thoughtful questions
and computational assistance of Ronald Brunner. Hubert Blalock has offered a number
of helpful suggestions. This research has been supported in part by the Yale Computer
Center, the Yale Political Data Program, and Northwestern University’s project on the
Simulation of International Processes financed by TWGA/ARPA/NU (Advanced
Research Projects Agency, SD 260).

1 See reference 45, p. 5, in the Bibliography of this paper.

2 The necessity and propriety of distinguishing social causation from merely physical
or biological causation has been argued at some length by Sorokin and MacIver (see
references 40 and 46). While accepting the reality of physical and biological determinism,
this concept implies the necessity of including human perspectives and activities
in our explanations of social, and in particular political, phenomena.

Going beyond the objections of positivistic philosophers, perceptive
students of political behavior have variously emphasized
that politics involves reciprocal relationships between representatives
and their constituencies, anticipated reactions of the strong
and the weak, functional exchanges of leadership and support, and 
even negative feedback from the forces of nature to our political 
helmsmen. These scholars do not so much object to the use of 
causal language as doubt that causal theories about the complex 
interrelationships of political life can be either explicitly stated or 
empirically tested in a satisfactory fashion. 

In addition to positivistic skeptics and doubting behavioralists, 
the critics of causal reasoning also include moralistic humanists. 
Doctrines of mechanistic causation and historical determinism are 
rejected as violating a fundamental belief in the freedom of the 
will. Even if the physical world is strictly determined, man’s 
nature requires him to be both free and morally if not causally 
responsible for his actions. 

Among the social sciences, political science has been especially
sensitive to the complexities of human behavior and to the responsibilities
of moral choice. In rejecting doctrines of economic determinism,
class warfare, or psychological behaviorism, many political
analysts have failed to learn of increasingly sophisticated and
less objectionable treatments of the causal inference problem that
have recently been proposed by economists, sociologists, psychologists,
and other social scientists. This paper will review some
of these developments, paying particular attention to the above-mentioned
problems of operationally defining and testing causal
relationships, modelling and testing reciprocal interactions, and
somehow accommodating the doctrines of determinism and free
will.


I. Definitions of the Causal Relation 

Objections to the Humean “constant conjunction” definition of 
causality can usually be interpreted as implying this definition’s 
incompleteness rather than its incorrectness. After a brief review of 
some additions to and modifications of the Humean position, we 
shall present and discuss alternative mathematical treatments of
the causal relation particularly appropriate for the social sciences. 


A. Components of the Causal Concept. 

Asymmetry. Perhaps the most fundamental implication of the 
Humean viewpoint is the asymmetry of the causal relation. A
sergeant’s command causes a private’s response. The temporal
asymmetry in this causal relation is as clear as that between lightning
and thunder: the cause comes before the effect. Also implied is
a unidirectional relationship: if somehow we can get sergeants to 
issue certain commands, their privates will obey, but not vice 
versa. The temporal and/or directional asymmetry of causation 
has been widely emphasized in the writings of philosophers, 
statisticians, and social scientists. 3 It allows us to think of causal 
chains (47), (28), causal paths (53), causal funnels (11), causal 
hierarchies (22, 45), and even, in the more metaphysical formulations,
of first causes and unmoved movers.

Political reality obviously includes a number of symmetrical, reciprocal
influence relationships: for example, bargaining, the exchange
of leadership for support, and arms races. If these can be
studied from a causal point of view, a major difficulty in applying
causal inference techniques to political phenomena will be overcome.
Fortunately, several procedures have been developed for
formalizing and testing causal models involving reciprocal relationships.
A number of appropriate labels have been suggested: causal
circles (52), reciprocal interaction (55), interdependent systems
(29, 34), deviation amplifying feedback (41), and mutual causal
processes (17). Since much of the rest of this paper will be devoted
to illustrating some of these concepts, we only note here that
all of these modelling techniques use the asymmetry idea in modelling
reciprocal relationships, with or without assuming time lags
between the initial and the feedback links.
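
As a minimal sketch of how asymmetric links with time lags can represent a reciprocal relationship, consider two actors X and Y, each responding only to the other's previous level (as in a stylized arms race). The function name `simulate_reciprocal`, the coefficients `a` and `b`, the starting values, and the omission of random terms are all illustrative assumptions, not part of any model cited above.

```python
def simulate_reciprocal(a, b, x0, y0, steps):
    """Iterate X[t] = a * Y[t-1] and Y[t] = b * X[t-1] (hypothetical coefficients).

    Each equation is asymmetric (one lagged cause, one effect); taken
    together the two links form a feedback loop between X and Y.
    """
    xs, ys = [float(x0)], [float(y0)]
    for _ in range(steps):
        x_next = a * ys[-1]  # X responds to Y's previous level
        y_next = b * xs[-1]  # Y responds to X's previous level
        xs.append(x_next)
        ys.append(y_next)
    return xs, ys


# Weak mutual responses: an initial disturbance dies out.
damped_x, damped_y = simulate_reciprocal(0.5, 0.5, 1.0, 1.0, steps=10)
# Strong mutual responses: the same disturbance is amplified.
amplified_x, amplified_y = simulate_reciprocal(1.2, 1.2, 1.0, 1.0, steps=10)
print(damped_x[-1], amplified_x[-1])
```

In this sketch the level of X two periods ahead equals a * b times its current level, so the product of the two coefficients decides whether the loop damps or amplifies deviations.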

Contiguity. In addition to the asymmetry of the causality concept,
a good deal of the relevant philosophical and social science
literature stresses the contiguity of cause and effect. Physicists
have long talked about the idea of “no action at a distance” (44,
Chapter 4). Social scientists, at least those in the Lewinian tradition,
have stressed that in order for sociological variables to cause
behavior, they must enter into the psychological field of the individual
(11, 46). Both natural and social scientists have rightly insisted
that completely adequate explanations should specify the
mechanisms linking causes to their effects (e.g., reference 28).

3 For instance, see references 7, 12, 16, 21, 29, 33, 44. Simon’s notion of “unilateral
couplings” and causal orderings among variables (43, Part I) correspond very closely,
for example, to Deutsch’s methods for establishing hierarchies in communication systems:
“For our studies of communication ... we might be very interested in getting
operational tests for what is subordinate, what is coordinate, what is entirely separate.
The test would be which feedback is coupled to the other asymmetrically” (18).

Although there are difficulties with the contiguity assumption
(contact between cause and effect seems always to be instantaneous),
this emphasis has engendered a number of enlightening theories
about specific links or pathways between natural or social causes
and their effects. In an important paper, Miller and Stokes, for
example, have compared the relative importance of congressmen’s
own beliefs about and perceptions of constituency opinions as influences
on roll-call voting behavior (42). At several points below
we also shall discuss the relative importance of alternative causal 
paths and mechanisms linking causes such as constituency opinions 
to their effects in the political arena. 

Lawfulness. Herbert Feigl has suggested a “purified” definition
of causality in terms of “predictability according to a set of laws”
(21, p. 408). Even allowing for the uniqueness in some ways of
every event, this definition makes explicit the need in causal inferences
for comparable cases, multiple observations, and empirical
generalizations. The existence of causal relations is in this sense 
nearly identical with the assumption—either metaphysical or 
methodological—of the uniformity of nature (see 32, Chapter 14 
and the discussion of J. S. Mill in 44, Chapter 4). This definition 
thus makes more explicit the “theory-laden” nature of simple causal 
statements (28): sergeants’ commands are obeyed because army 
traditions of authority are strong and because privates usually find 
the cost of noncompliance to be too high, etc. A number of other 
possibly different influences on behavior are also assumed to apply 
whenever we make even a simple causal argument. 

This emphasis on empirical lawfulness helps to harmonize the 
mindlessness of the “constant or probable conjunction” view with 
more voluntaristic and humanistic outlooks. 4 In a recent critique of 
the Humean constant conjunction position, a British political philosopher,
Alasdair MacIntyre, has argued:


4 The Aristotelean view of causal laws can also be interpreted in both of these ways. 
For a statistical political example see the discussion and references in (2, pp. 3-5). 
The distinguishing of causal explanations from teleological, predictive, functional, and 
genetic ones will not be further undertaken here. (See, however, 32 and 44). 
... to look for the antecedents of an action is not to search for an invariant
causal connection, but to look for the available alternatives and
to ask why the agent actualized one rather than another. . . . The explanation
of a choice between alternatives is a matter of making clear
what the agent’s criterion was and why he made use of this criterion
rather than another and to explain why the use of this criterion appears
rational to those who invoke it. (39, p. 61).

To reconcile these two points of view we need first to stress that
causal laws are not logically necessary or invariant but rather
empirically observed constant conjunctions, such as commands and
actions; secondly, we need to discover repeatedly invoked decisional
criteria explaining observed responses. Some, but not all,
causal arguments about human behavior do give such additional
explanations in terms of expectations of undesired punishment, etc.
Thirdly, we need to specify the historical context, constant or
changing, within which such generalizations are expected to hold.

Some modes of causal explanation only implicitly explain why
certain criteria of choice are used. In regression-like causal models,
for example, undetermined coefficients represent choices of a particular
criterion of action; each nonzero coefficient suggests the
relevance of a particular variable, but the unspecified magnitude
of the coefficient indicates a “degree of freedom” in the model.
Voluntaristic and teleological explanations, which plague many
physical scientists (see 21, 40), are both relevant and necessary to
complete such explanations.

Determinativeness. A fourth connotation of the causation concept
is frequently mentioned in the social science literature. Terminology
such as “independent” and “dependent” variables (33), one
variable “forcing” or “producing” changes in another (7), or the
“manipulative” or “operational” significance of a causal equation
(45) all strongly imply the determinativeness of the cause on the
effect. In perhaps the most sophisticated recent statement of this
point of view, Wold has suggested treating causal equations as
“autonomous behavioral relationships” for different groups of actors
in the economy (e.g., producers or consumers), giving the value
of the “response” variable “conditionally expected” on the basis of
known values of the “stimulus” variables and a ceteris paribus
assumption about variables not explicitly included in the equation
being discussed (52). This dependence of effect on cause is always
seen as more than merely a statistical or logical asymmetric relationship
(6, 49, 51). 5

Either out of modesty about attempting to mention all the causes 
involved or because of a basic belief that reality contains both 
deterministic and chance relationships, most physicists and social 
scientists no longer attempt to employ completely deterministic 
models. This liberation of causal thinking from an oversimplified 
and strongly mechanistic assumption has probably benefited social 
scientists even more than physicists. Multivariate (i.e., many-variabled)
and stochastic (i.e., probabilistic) theories (see 7, 14,
34) have replaced many of the deterministic, single cause theories 
of the past (e.g., crude Marxism or Freudianism). As a result, in 
the mathematical formalizations of causal theory discussed below, 
the use of probabilistic or random terms will play an important 
part. So will the more modest goal of multiple causal explanations. 

Ceteris paribus. Both the determinativeness and lawfulness of
causal relations require a tentativeness about causal inferences that
is sometimes overlooked. As empirical generalizations they can
always be proved wrong by a sufficient number of counterexamples.
Tentativeness concerning one’s conclusions is also required
because at some point one must assume that possibly confounding
variables have been adequately controlled for. All concrete
statements of causal relationships more or less explicitly
make ceteris paribus assumptions. 6 An important consequence of
this necessity is the need for caution about the extent to which
model outcomes may be due to violations of these assumptions.

5 [Illegible], for example, would distinguish sharply between a correlational and
a causal interpretation of statistical coefficients like Goodman and Kruskal’s tau or
[illegible] linear regression. Both coefficients are asymmetrically, but not usually
causally, interpreted. In addition, the causal viewpoint, unlike the merely “predictive”
one, requires special attention to errors in the independent variables. (See 33, Chapter
[illegible].)

Philosophers, on the other hand, are speaking both determinatively and asymmetrically
when they refer to causes as important sufficient conditions. (See 44, p. 339.)

6 As in Wold’s case above, definitions of the causal relation by Lazarsfeld,
Blalock, and Simon have all stressed the tentativeness and falsifiability of
causal claims in view of the ceteris paribus assumptions involved in concrete causal
[illegible]. Thus Lazarsfeld, referring to [illegible] two attributes X and Y within
subcategories of a test factor C, suggests that
“if we have a relationship between X and Y, and if for any antecedent test factor
[C], the partial relationships between X and Y do not disappear, then the original
relationship should be called a causal one.” (36, p. 146) Similarly, after assuming that
all other variables explicitly included in the causal model have been controlled or do
not vary, Blalock says that “X is a direct cause of Y if and only if change in X
produces a change in the mean value of Y.” The list of other variables being controlled
is limited to those explicitly included in the model, but Blalock must also assume that
“the mean change in Y, for a given change in X is the same as the change that would
always occur if all outside influences could be rigidly controlled.” (7, p. 19)
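
The Lazarsfeld criterion quoted in footnote 6 can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: it uses continuous variables and a first-order partial correlation in place of Lazarsfeld's cross-tabulation of attributes within subcategories of the test factor, and the data-generating process (an antecedent factor C producing both X and Y, with no link between X and Y) is invented for illustration. The helper names `corr` and `partial_corr` are likewise hypothetical.

```python
import random
from math import sqrt


def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def partial_corr(x, y, c):
    """First-order partial correlation of x and y, controlling for c."""
    r_xy, r_xc, r_yc = corr(x, y), corr(x, c), corr(y, c)
    return (r_xy - r_xc * r_yc) / sqrt((1 - r_xc ** 2) * (1 - r_yc ** 2))


# Assumed process: C is antecedent to both X and Y; X does not affect Y.
rng = random.Random(1)
n_cases = 5000
c = [rng.gauss(0, 1) for _ in range(n_cases)]
x = [ci + rng.gauss(0, 1) for ci in c]
y = [ci + rng.gauss(0, 1) for ci in c]

r_xy = corr(x, y)               # sizeable: X and Y share the cause C
r_xy_c = partial_corr(x, y, c)  # near zero once C is controlled
print(round(r_xy, 3), round(r_xy_c, 3))
```

Under these assumptions the original X-Y relationship "disappears" when the test factor is controlled, so by the quoted criterion it would not be called causal; had X entered the equation for Y, the partial relationship would have persisted.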

Historically, Ronald Fisher’s work on randomization techniques
has provided a number of systematic methods for experimentally
isolating particular explanatory relationships. More recent work by
sociologists and economists has focused on ways of being more
sure about or relaxing the ceteris paribus assumptions themselves
(5; 7, Chapter 5; 29; 52). Various of these techniques will be examined
in some detail below.

B. Mathematizing the Causal Relation.

When dealing with highly general scientific concepts, it helps to
state them in a precise, abstract fashion capable of both logical
and (when given a particular interpretation) empirical investigation.
A more immediate reason for using mathematical formalizations
of the causal relation is the extent to which previous formulations,
from Aristotle on down, can better be understood by doing
so. It then becomes clearer, for example, that logical necessity
applies to the mathematical models rather than to the world they
represent. Finally, the equation systems to be studied below make
parsimonious, yet often plausible and testable, assumptions about
the asymmetry, contiguity, lawfulness, determinativeness, and
ceteris paribus aspects of causal statements.

Following the econometric tradition (especially Wold, 52) we
shall assume (1) that the causal laws relating members of a particular
population can at least approximately be represented by
linear, additive equations; 7 (2) that the philosophical belief in
probabilism or partial determinism is adequately expressed by introducing
“random” or “residual” terms in these equations, which
themselves may be considered as empirically based generalizations
about human behavior; 8 (3) that ceteris paribus assumptions will
be stated as assumptions about these “random terms”; (4) that in
each equation it will be possible to distinguish “independent”
from “dependent” variables, 9 with particular coefficients indicating
the magnitude of each causal link involved; and (5) that temporal
asymmetries may be indicated by t subscripts on variables for the
different times at which they occur.

7 Threshold phenomena and multiplicative relationships are discussed in 1, 8,
14, 11. Linear additive or multiplicative models with discrete time subscripts and random
terms are very similar to some of the differential equation models used by Rapoport,
Coleman, and others. They differ, however, to the extent that they include (1)
only finite differences in variable values and change rates (rather than infinitesimal ones
as in the calculus); and (2) random or error terms whose effects can more realistically
change and cumulate from one time period to another.

8 Economists often include in their models equations not susceptible to behavioral
interpretations. The complicating implications of including definitional relations, etc.,
are discussed fully in (29, 34, 50).

Linear causal systems. With these assumptions, we can represent
any system of causally interrelated variables by a set of linear
equations like Equations (1a). These equations are assumed to
apply to all N members of some specified population:

\[
\begin{aligned}
X_1 + a_{12} X_2 + \cdots + a_{1G} X_G + b_{11} Z_1 + b_{12} Z_2 + \cdots + b_{1H} Z_H &= U_1 \\
a_{21} X_1 + X_2 + \cdots + a_{2G} X_G + b_{21} Z_1 + b_{22} Z_2 + \cdots + b_{2H} Z_H &= U_2 \\
&\;\;\vdots \\
a_{G1} X_1 + a_{G2} X_2 + \cdots + X_G + b_{G1} Z_1 + b_{G2} Z_2 + \cdots + b_{GH} Z_H &= U_G
\end{aligned}
\tag{1a}
\]


Notationally, there are G endogenous (mutually dependent) variables,
denoted by X’s, and H exogenous (independent or predetermined)
variables, denoted by Z’s. It would be possible to have
various time subscripts on the Z’s; if desired the Z’s could even be
lagged values of the X’s, as when $Z_1 = X_{1(t-1)}$, $Z_2 = X_{2(t-1)}$, etc. We
shall assume that there are G equations, each containing at least
one endogenous variable as well as other variables causally influencing
the endogenous ones, in particular a random term U and
possibly other exogenous variables. For simplicity of exposition, we
shall assume all the X’s, Z’s, and U’s to have expected (mean)
values of zero, and set the coefficient for one distinct endogenous
variable in each equation equal to unity. 10

Alternative simplifications. In causal systems like (1a) each endogenous
variable is determined by all other endogenous variables,
all exogenous variables, and a random or residual term. If
we are to assume the specific values of the a’s, b’s, and U’s to be
unknown, we cannot estimate the magnitude of these terms without
further assumptions relating either to the U’s or to the a’s and
b’s. In designing models that are testable and whose parameters
can be uniquely estimated, a number of further simplifications
must therefore be made.

9 In the econometric tradition, one may classify variables either in a single equation
or in a set of equations according to this kind of distinction. For a system of equations,
mutually dependent variables are called “endogenous” while those assumed to cause, but
not to be caused by, the endogenous variables are known as “predetermined” or
exogenous variables. It is sometimes useful to think of exogenous variables as those
that could be experimentally controlled, along with the residual terms, and the endogenous
variables as those whose resulting variation we wish to examine.

10 Using boldface letters to indicate matrices, we can succinctly represent the causal
system (1a) either by the matrix equation (1b):

\[ \mathbf{A}\mathbf{X} + \mathbf{B}\mathbf{Z} = \mathbf{U} \tag{1b} \]

or the even more compact form of equation (1c):

\[ \mathbf{C}\mathbf{Y} = \mathbf{U} \tag{1c} \]

In these equations A and B are G × G and G × H coefficient matrices; X and U
are G × 1 column vectors and Z is an H × 1 column vector (written for all N members
of the population, they become G × N and H × N matrices). C is a
G × (G + H) matrix composite of A and B, while Y similarly contains the G + H variables.
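To make the notation of the general system concrete, the following is a minimal sketch, assuming NumPy and arbitrary illustrative sizes and coefficient values (none of them come from the text). It builds one random instance of AX + BZ = U with unit coefficients on the diagonal of A, solves for the endogenous variables given the exogenous variables and residuals, and checks the compact form CY = U.

```python
import numpy as np

rng = np.random.default_rng(0)
G, H, N = 3, 2, 1000  # illustrative sizes only (assumption, not from the text)

# A has ones on its diagonal: in each equation one endogenous
# variable is given a coefficient of unity.
off_diagonal = 1.0 - np.eye(G)
A = np.eye(G) + 0.3 * rng.standard_normal((G, G)) * off_diagonal
B = 0.5 * rng.standard_normal((G, H))

Z = rng.standard_normal((H, N))  # exogenous variables, one column per population member
U = rng.standard_normal((G, N))  # residual terms

# Given Z and U, the endogenous variables follow from A X = U - B Z.
X = np.linalg.solve(A, U - B @ Z)

# Compact form: C = [A B] and Y stacks X on top of Z, so that C Y = U.
C = np.hstack([A, B])
Y = np.vstack([X, Z])
max_abs_error = float(np.max(np.abs(C @ Y - U)))
print(X.shape, C.shape, Y.shape, max_abs_error)
```

The sketch runs in the "forward" direction only. Recovering A and B from observed X and Z is the hard part, and that is what the further simplifying assumptions are needed for.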

It turns out that choosing among alternative simplifications of
linear stochastic systems involves us in a number of the conceptual
controversies mentioned previously. The linear stochastic systems
approach not only calls attention to these issues (some controversial
decisions will have to be made before the model itself can
be tested and its coefficients identified) but also allows us to state
and argue the issues involved in a logical and empirical fashion.

These alternative choices of model-building assumptions can be
arrayed along a number of dimensions. Specifically, for any multi-equation
linear stochastic causal system, we must decide between:

1) Hierarchical versus circular causation. The basic idea of
hierarchical causal relationships is that there exists a ranking of
endogenous variables defined only in terms of other endogenous
variables on which they are unilaterally dependent, and in terms
of exogenous variables. In such a system the highest rank would
be given to the first causes, the “unmoved movers” that depend
only on exogenous variables and residual terms. Lower ranked
endogenous variables are assumed to depend unilaterally only on
higher ranked endogenous variables, exogenous variables, and
residual terms. Because each variable is thus defined recursively
(i.e., in terms of previously definable, higher ranked variables),
such a set of equations is known as a recursive system. Because
only unilateral dependencies occur in each equation of a recursive
system (the dependent variable never “causes” the independent
one) it is possible in an unambiguous way to estimate the coefficients
of such an equation without taking other equations into
account (see 51, Chapter 2).
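The equation-by-equation property of recursive systems can be illustrated with a small simulation. This is a sketch under stated assumptions: the three-variable system, its coefficient values, and the helper `ols` are hypothetical, the residuals are drawn independently, and each equation is written with its dependent variable isolated on the left (so the signs are the reverse of the text's convention of keeping all X terms on one side).

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20_000

# Hypothetical recursive system with mutually uncorrelated residuals:
#   X1 = U1
#   X2 = 0.7*X1 + U2
#   X3 = 0.2*X1 + 0.5*X2 + U3
U = rng.standard_normal((3, N))
X1 = U[0]
X2 = 0.7 * X1 + U[1]
X3 = 0.2 * X1 + 0.5 * X2 + U[2]

def ols(y, *regressors):
    """Least-squares slopes of y on the given regressors (all variables have mean zero)."""
    design = np.column_stack(regressors)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Each equation is estimated on its own, ignoring the others.
b2 = ols(X2, X1)        # estimate of the X1 -> X2 coefficient
b3 = ols(X3, X1, X2)    # estimates of the X1 -> X3 and X2 -> X3 coefficients
print(b2, b3)
```

With a large sample the separate regressions land close to the coefficients used to generate the data, which is the sense in which a recursive system decomposes into autonomous equations.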

In fact, this simple decomposability of recursive systems into
the behavioral regularities of autonomous (but partly determined)
variables or actions is a major reason for the attractiveness of such
models: it allows us to think about autonomous actors, and simply
to describe their behavior. Another reason for the attractiveness
of such models is their testability. Making additional assumptions
about the residual terms, one can derive from competing recursive
models a number of empirical predictions on the basis of which
one model can be chosen over another. This procedure will be
illustrated in Section II below. Recursive, hierarchical models even
allow for feedback relationships among endogenous variables if
we assume such interactions take time. Then the endogenous
variable being fed back also enters into our equations in an earlier,
exogenous form.

If we feel reality or the approximate data that we can get from
reality includes instantaneous feedback relationships, then a hierarchical
ranking of endogenous variables is no longer possible. In
this case not all endogenous variables are unilaterally dependent
on logically prior variables. Estimating the coefficients of a single
equation in such a relationship without taking the additional
circularities into account gives us an incorrect picture of even the
unidirectional causal relationships that it contains (a mathematical
illustration of this point based on work by Haavelmo is given by
Valavanis in 50, Chapter 4).
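The distortion produced by ignoring circularity can be seen in a two-variable sketch. The reciprocal coefficients `b12` and `b21` below are hypothetical values chosen for illustration, the residuals are independent with unit variance, and the example is a generic simultaneity illustration rather than a reproduction of the Haavelmo or Valavanis treatment.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000
b12, b21 = 0.5, 0.8  # hypothetical reciprocal effects: X2 -> X1 and X1 -> X2

U1 = rng.standard_normal(N)
U2 = rng.standard_normal(N)

# Solve the simultaneous pair  X1 = b12*X2 + U1,  X2 = b21*X1 + U2.
det = 1.0 - b12 * b21
X1 = (U1 + b12 * U2) / det
X2 = (b21 * U1 + U2) / det

# Single-equation least squares of X1 on X2, ignoring the feedback from X1 to X2.
ols_slope = np.cov(X1, X2)[0, 1] / np.var(X2, ddof=1)

# Large-sample value of that slope for unit-variance, uncorrelated residuals.
expected_limit = (b21 + b12) / (b21**2 + 1.0)
print(ols_slope, expected_limit, b12)
```

The single-equation slope settles near `expected_limit`, not near `b12`: the regressor X2 is itself partly caused by U1, so it is correlated with the residual of the equation being estimated.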

2) Incomplete versus complete causal specification. We can
build models with or without exogenous variables. Recall that
circular causal models appear to some as more realistic and to
others as a practical convenience because good time-specific data
are not available. Similarly, the use of exogenous variables is thought
by some to be more realistic and by others to be an unnecessary
complication.

From a theoretical standpoint treating both endogenous and
exogenous variables as “independent” causes in a recursive system
seems an unnatural complication. Not identifying the causes of
the exogenous variables also smacks of incompleteness. There are
a number of reasons for including exogenous variables, however.
For nonrecursive models, they can give us sufficiently distinct
equations so that we can somehow grasp (“control” or “manipulate”
in a quasi-experimental sense) each equation separately and
uniquely identify its coefficients using multiequation estimation
techniques (50, Chapters 6, 9). For recursive models, they can
give us a larger number of predictions for testing the models (as
in 9 and 51). For either recursive or nonrecursive systems, they
force us to be somewhat more specific about the
other factors influencing the endogenous variables, thereby reducing
our dependence on possibly unrealistic assumptions about the residual
terms.

3) Uncorrelated versus correlated residual terms. If residual
terms are uncorrelated, the causal system is assumed to be isolated
to the extent that no outside variable affects more than one endogenous
variable. As implied above, including exogenous variables
can help make the “uncorrelated residuals” assumption more
realistic. Fortunately the uncorrelated residuals assumption is a
testable one, as the methods for choosing among causal models,
noted below, illustrate.

If we drop the uncorrelated errors assumption, we are in a
certain sense agreeing that causal systems cannot be as nearly
isolated from reality as the uncorrelated residuals assumption
implies. Imbedding our models to the extent that other variables
might be assumed to cause residual changes in several of the endogenous
variables is undoubtedly more realistic. Unfortunately,
however, not assuming uncorrelated residuals means that it is even
harder to identify uniquely the equations of causal models. 11

4) Static versus dynamic models. Whether or not we include 
explicit temporal asymmetries in our causal models is another 
major model building choice, capable of either philosophical or 
empirical argument. Econometricians more than sociologists, for 
example, have studied time-lagged relationships, probably because 
meaningful time units (e.g., fiscal years) and data have been more 
readily available. (Compare 29 and 34 with 7, 10, 14, and 42.) Two
other related issues are also involved: the problem of making longi¬ 
tudinally valid (time series) predictions from cross-sectionally 
gathered (simultaneous) data and the extent to which statistical 
models assume or imply social systems to be in equilibrium. 12 

A basic problem is how to interpret the coefficients or their esti¬ 
mates obtained from simultaneous models in which no time sub¬ 
scripts explicitly occur. Coleman argues, for example (14), that 
assumptions about uncorrelated residuals and unchanging model 
coefficients are in effect equivalent to assumptions that processes 
underlying simultaneous data are in equilibrium with each other. 
He also has shown that coefficients from cross-sectional analyses 


11 Some of Wold’s most interesting work in implicit causal systems (52) concerns 
recursive models with correlated residual terms. Some such models give predictions 
equivalent to certain non-hierarchical systems. In order to make his models give de¬ 
terminate predictions, however, Wold has either to introduce exogenous terms or fail to 
identify some of the coefficients he employs. See also (53, 55). 

12 The cross-sectional versus longitudinal inference problem is discussed in a number
of places. See, for example, (2, Chapter 5 and the references) and (50, Section 12.17).
Coleman’s work on the equilibrium interpretation of causal models is outstanding (14,
Part II, and 15) and so is the wealth of material in the econometric literature (52,
Section 2, and references).

have a simple longitudinal interpretation only when variable
change rates are either negligible or constant (15). Perhaps the
best we can say is that simultaneous results are more nearly
causally interpretable in a longitudinal sense when re-equilibrating
tendencies are quick and variable change rates are slow. 13

Mixing simplifying assumptions. As implied above, different
simplifying assumptions tend to go together. At one “extreme”
there are those model builders (e.g., Blalock) who tend to use
hierarchical causal relationships, no exogenous variables, uncorrelated
error terms and temporal equilibrium; at the other “extreme”
are those (e.g., Koopmans) who tend to assume circular causal
relationships and correlated residual terms and use exogenous
variables and time-lagged equations. Obviously there are no
simple ordering principles that will explain why the four major
issues in model building have tended to produce these two extremes.

A certain underlying set of attitudes about the decomposability 
or the interdependence of social reality seems, however, to exist. 
The hierarchical modellers act as if they believe reality to be more 
decomposable. This would mean that unilateral dependencies are 
adequate for describing causal relationships in which changes are 
slow or negligible and that causal systems can be satisfactorily 
isolated from their environments. If the uncorrelated residuals 
assumption is not too bothersome, then we need not introduce 
exogenous variables. 

The modellers of reciprocal, interdependent systems, on the 
other hand, may see reality as less decomposable and more dy¬ 
namic. Simultaneous reciprocal dependencies would then seem 
natural, as well as outside influences on several of the endogenous 
variables. Exogenous variables, in this view, are necessary to help 
identify the coefficients of the reciprocal relationships among the 
dependent variables, as well as to reduce difficulties created by 
variables affecting the residual terms. 

Perhaps a better organizing principle would be that the “hierarchical”
modellers are at an earlier stage in theory-building. Blalock
might argue, for example, that current data and theories don’t
suggest or allow more realistic and more complex, yet falsifiable,
models. Certainly the statistics of recursive systems (least-squares
analysis as compared to maximum likelihood methods) is much
simpler than that of interdependent ones. This developmental
perspective would also explain the greater concern with testing
alternative theories to be found in the “hierarchical” literature and
the increased attention to parameter estimation techniques by the
modellers of interdependent systems.

13 Econometricians usually make a number of additional assumptions that are also
controversial, but will not be discussed here: (5) that the residual U’s have quasi-random,
i.e. joint normal distributions, with zero means; (6) that the U’s are
not autocorrelated (i.e., correlated with previous values of themselves); (7) that
residual terms are also uncorrelated with the explicit endogenous variables.
(See 50, pp. 77-79). These assumptions are most useful when trying to estimate population
parameters from sample statistics.

Fortunately, a number of modelling approaches fall in between
these two extremes. Blalock has recently discussed nonlinear
models and is interested in dynamic systems (8); Boudon has
recently dealt with parameter estimation problems in hierarchical,
static models with correlated residual terms (10); Wold has introduced
the idea of implicit causal models that are dynamic, allow
correlated residual terms, but maintain hierarchical relationships
among equations denoting behavioral regularities of autonomous
actors (52).


II. Hierarchical Causal Relationships 

We shall illustrate both the simplicity and the attractiveness of
the hierarchical modelling approach with several models drawn
from Daniel Lerner’s classic work on political development, The
Passing of Traditional Society (37). Each of the linear stochastic
models discussed will assume unilateral causal dependencies (recursive
relationships), uncorrelated residual terms, and no explicit
temporal asymmetries. Exogenous variables will not be employed
in our estimating procedures nor considered as elements in paths
linking causes to their effects.

Lerner states his thesis about the universal modernizing role of
literacy and media development in several places and in several
ways. At one point he refers to the chicken and egg problem of
reciprocal causal relations (37, p. 56) and forsakes speculative
causal inference problems for testable correlational ones. But then
he argues in almost hierarchical causal fashion:

... the Western model of modernization exhibits certain components
and sequences whose relevance is global. Everywhere, for example,
increasing urbanization has tended to raise literacy; rising literacy has
tended to increase media exposure; increasing media exposure has “gone
with” wider economic participation (per capita income) and political
participation (voting). (37, p. 56). 14

Only in the “gone with” link does he use language that is clearly 
correlational and not causal. 

At another point (37, pp. 58-65), Lerner makes inferences from
multiple correlations to three-phase historical and causal sequences.
Relations between stages, if not within them, also appear to be
hierarchical (recursive and unilateral).

Within [the] urban matrix develop both of the attributes which dis¬ 
tinguish the next two phases—literacy and media growth. There is a close 
reciprocal relationship between these, for the literate develop the media 
which in turn spread literacy. But, historically, literacy performs the 
key function. . . . The capacity to read . . . equips [the population] to 
perform the varied tasks required in the modernizing society. Not until 
a third phase . . . does a society begin to produce newspapers, radio 
networks, and motion pictures on a massive scale. This, in turn, accel¬ 
erates the spread of literacy. Out of this interaction develop those in¬ 
stitutions of participation (e.g., voting) which we find in all advanced 
modern societies (37, p. 60). 

Mathematizing the causal relations. Before presenting the causal
models derived from Lerner’s work, let us briefly mention the
operational procedures used in measuring the variables we shall
study. Urbanization will be measured by the percent of a nation’s
population living in cities of over 20,000 population size; literacy
will also be a percentage measure: percent of population over
age 15 that is literate as reported to UNESCO sources. Media
development will be measured by a combined index (whose
elements are highly intercorrelated) of media items: per capita
radios, newspapers, telephones, etc. The political participation
index is a weighted sum of voting turnout data and a measure of
political enculturation (two indices of participation which themselves
are rather distinctly intercorrelated). 15

Let us now explicitly state several possible models implied by 
the Lerner quotations. Since he himself is quite explicit about 


14 W. D. Burnham has reminded me that American political development obviously
does not fit this scheme; its high male political participation came before high media
exposure and economic development. The Lerner model makes more sense in European
and Afro-Asian contexts.

15 References are given in Figure 1. It is clear that the operationalizations are at
best tentative, especially since they are assumed to have interval scale validity. Some
of the most profitable work on causal modelling in political science will obviously deal
with variables that have lower levels of measurement. For a variety of causal models
applicable to nominal scale data, see 14, Chapters 4-6.

reciprocal linkages, our hierarchical models are obviously simplifications
of his thinking, useful mainly for illustrating causal modelling
procedures. Because all our data are from around the year
1960, we in addition must assume rather than prove the relevance
of cross-sectional analysis procedures for longitudinal inference.

The simplest interpretation of the above quotes is a three-fold
“stages of development” theory, in which urbanization brings
about literacy, literacy then increases media development, which
in turn increases political participation. By allowing only the
developmental sequence of causal links above, such an obviously
oversimplified approach sounds something like a civics course
(Emily Post version): come off the farm, learn to read, read the
newspapers, and then vote wisely. Mathematically, this theory may
be represented by equations (2a) and (2b):

\[
\begin{aligned}
X_1 &= U_1 \\
a_{21}X_1 + X_2 &= U_2 \\
a_{32}X_2 + X_3 &= U_3 \\
a_{43}X_3 + X_4 &= U_4
\end{aligned}
\tag{2a}
\]

\[
\Sigma U_1U_2 = \Sigma U_1U_3 = \Sigma U_1U_4 = \Sigma U_2U_3 = \Sigma U_2U_4 = \Sigma U_3U_4 = 0
\tag{2b}
\]

In these equations $X_1$ indicates urbanization, $X_2$ literacy, $X_3$ media
development, $X_4$ political participation; and $U_1$, $U_2$, $U_3$, $U_4$ are the
corresponding residual causes, which are assumed to be uncorrelated.
In system (2a), $X_1$ is the “dependent” variable in the first
equation, $X_2$ depends on $X_1$ and $U_2$ as indicated by the second
equation, etc. An arrow scheme representation of this “simple
stages theory” is given in Figure 1A.

Figure 1. Three Causal Theories of National Political
Participation Levels.* (N = 85)

A. Simple Stages Theory

Urbanization ($X_1$) → Literacy ($X_2$) → Media Development ($X_3$) → Political Participation ($X_4$)

Deductions         Predictions                         Results
$b_{13.2} = 0$     $r_{13} = r_{12} r_{23}$            .41 vs. (.70)(.58) = .41
$b_{24.3} = 0$     $r_{24} = r_{23} r_{34}$            .66 vs. (.58)(.42) = .24
$b_{14.23} = 0$    $r_{14} = r_{12} r_{23} r_{34}$     .42 vs. (.70)(.41)(.42) = .12

B. Lerner Theory

Urbanization ($X_1$) → Literacy ($X_2$)
Urbanization ($X_1$) → Media Development ($X_3$)
Literacy ($X_2$) → Media Development ($X_3$)
Literacy ($X_2$) → Political Participation ($X_4$)
Media Development ($X_3$) → Political Participation ($X_4$)

Deduction: $b_{14.23} = 0$

Prediction:
\[ r_{14} = r_{12} r_{24} + \frac{(r_{34} - r_{24} r_{23})(r_{13} - r_{12} r_{23})}{1 - r_{23}^2} \]

Results:
\[ .42 \ \text{vs.}\ (.70)(.66) + \frac{[.42 - (.66)(.58)][.41 - (.70)(.58)]}{1 - (.58)(.58)} = .46 \]

C. Revised Lerner Theory

Urbanization ($X_1$) → Literacy ($X_2$)
Literacy ($X_2$) → Media Development ($X_3$)
Literacy ($X_2$) → Political Participation ($X_4$)
Media Development ($X_3$) → Political Participation ($X_4$)

Deductions         Predictions                                                                                              Results
$b_{13.2} = 0$     $r_{13} = r_{12} r_{23}$                                                                                 .41 vs. .41
$b_{14.23} = 0$    $r_{14} = r_{12} r_{24} + (r_{34} - r_{24} r_{23})(r_{13} - r_{12} r_{23}) / (1 - r_{23}^2)$             .42 vs. .46

* Data on urbanization and literacy are from Russett, Alker, Deutsch, Lasswell,
World Handbook of Political and Social Indicators (New Haven: Yale University Press,
1964). Media Development is a factor index based on per capita radio, newspapers,
telephones, and other data derived from the World Handbook and other sources. Political
participation data come from Alker and Hopkins, reference 3. Basic ingredients
are World Handbook data on percentage voting turnout and Banks and Textor data
on political enculturation (The Cross-Polity Survey [Cambridge: M.I.T. Press, 1963]).

A much more realistic theory for explaining high levels of national
political participation is given in Figure 1B. As represented
there, and in Equations (3a) and (3b) below, this “Lerner theory”
suggests the urban situation ($X_1$) as a cause of both literacy ($X_2$)
and media development ($X_3$), with literacy rather than the media
playing the key role in the obviously reciprocal relationship between
media producers and consumers. Out of the interaction of
literacy and the media develop higher levels of mass political participation
($X_4$). In each equation the “dependent” variable
on the left of the equals sign is distinguished by its coefficient of
unity.

\[
\begin{aligned}
X_1 &= U_1 \\
a_{21}X_1 + X_2 &= U_2 \\
a_{31}X_1 + a_{32}X_2 + X_3 &= U_3 \\
a_{42}X_2 + a_{43}X_3 + X_4 &= U_4
\end{aligned}
\tag{3a}
\]

\[
\Sigma U_iU_j = 0 \quad (i, j = 1, \dots, 4;\ i \neq j)
\tag{3b}
\]

We are assuming in Equations (3b) that, other things being
equal, the residual influences on $X_1$, $X_2$, $X_3$, and $X_4$ are
uncorrelated with each other.

It is interesting to compare causal systems (2a,b) and (3a,b)
with the general (indeterminate) model (1a). The new models
have been restricted in several ways. Besides the obvious omission
of exogenous variables and the “uncorrelated” residuals assumption,
a subtler difference concerns the pattern of linkage, or dependence
coefficients, the a’s. A sufficiently large number of
a’s have been assumed to equal zero so that it is possible to arrange
all the remaining a coefficients into a triangular pattern (the X
and a subscripts were in fact chosen so that this would occur). In
a geometric sense, this kind of coefficient “pyramid” corresponds
exactly to the set of recursive relations that characterize hierarchical
causal systems. In Equations (2a) and (3a), $X_1$ is the first
cause, literacy the second cause, and media the third. Each variable
is caused only by random influences and other variables
higher in the causal hierarchy. Figures 1A and 1B also suggest
visually the same pyramid of unilateral causal dependencies.
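The triangular coefficient "pyramid" can be checked mechanically. In the sketch below, which assumes NumPy, each row stands for one equation and each column for one of the four variables, and an entry of 1 marks a coefficient that is allowed to be nonzero (including the unit coefficient of the dependent variable). The 0/1 patterns are read off the two four-equation systems in the text; the variable names are illustrative.

```python
import numpy as np

# Rows = equations, columns = X1..X4; 1 = coefficient present, 0 = assumed zero.
simple_stages = np.array([
    [1, 0, 0, 0],   # X1 = U1
    [1, 1, 0, 0],   # a21*X1 + X2 = U2
    [0, 1, 1, 0],   # a32*X2 + X3 = U3
    [0, 0, 1, 1],   # a43*X3 + X4 = U4
])
lerner = np.array([
    [1, 0, 0, 0],   # X1 = U1
    [1, 1, 0, 0],   # a21*X1 + X2 = U2
    [1, 1, 1, 0],   # a31*X1 + a32*X2 + X3 = U3
    [0, 1, 1, 1],   # a42*X2 + a43*X3 + X4 = U4
])

def is_triangular(pattern):
    """True when no coefficient lies above the diagonal (a recursive ordering exists)."""
    return bool(np.array_equal(pattern, np.tril(pattern)))

def zero_links(pattern):
    """Number of off-diagonal coefficients assumed to be zero."""
    return int(np.sum(pattern == 0))

print(is_triangular(simple_stages), zero_links(simple_stages))
print(is_triangular(lerner), zero_links(lerner))
```

Both patterns are lower triangular, and both set more than six of the twelve off-diagonal coefficients to zero, which is what leaves room for testable predictions among the correlations.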

Choosing among causal models. The tentativeness of all scientific
arguments, in particular causal ones, means that at best we will
fail to reject one or several after submitting them to the test of

16 Other restrictions, e.g., equations relating the a’s, could
also have been made, but would probably violate the
present assumption of hierarchical causal relationships.

17 The model with $a_{31}$ equal to zero (Lerner at
one point omits this link) is given in Figure 1C; it will be discussed in more detail in
the next section of this paper.

experience. To test the hierarchical theories in Figure 1, we need
to make deductions from their mathematical models, which (when
empirically reinterpreted) we may match with observed relationships.

At least two equivalent strategies for making deductions from
the models of systems (2a,b) and (3a,b) are possible (a third is
described in footnote 20). First of all, we can multiply together
pairs of equations in either model (e.g., 2a), average these products
over all members of the population being studied, divide
these products on both sides of the equals signs by appropriate
standard deviations, and then set the $\Sigma U_iU_j$ products equal to
zero because of assumptions (2b) and (3b). These equations can
then be solved for the values of the $a_{ij}$’s in terms only of observable
variances and correlations; if enough $a_{ij}$ terms have been set equal to
zero (more than C(4,2) = 6 in the present examples), additional
predictions among observable correlation coefficients will also be
found. For more extended applications of this method the reader
is referred to (45, Chapter 2; 2, Chapter 6; and 10).
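The kind of prediction this first strategy yields can be checked numerically for a simple chain model. The following is a sketch under assumptions: the link strengths 0.8, 0.6, and 0.5 are hypothetical, the residuals are simulated as independent standard normal draws, and the dependent variable is isolated on the left of each equation. It verifies by simulation, not by algebra, that in a chain X1 → X2 → X3 → X4 with uncorrelated residuals the correlations between non-adjacent variables equal products of the correlations along the chain.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100_000

# Hypothetical simple-stages chain with mutually uncorrelated residuals.
U = rng.standard_normal((4, N))
X1 = U[0]
X2 = 0.8 * X1 + U[1]
X3 = 0.6 * X2 + U[2]
X4 = 0.5 * X3 + U[3]

R = np.corrcoef(np.vstack([X1, X2, X3, X4]))

def r(i, j):
    """Sample correlation between X_i and X_j (1-based indices)."""
    return float(R[i - 1, j - 1])

# Observed correlation versus the product predicted by the chain model.
checks = {
    "r13 = r12*r23": (r(1, 3), r(1, 2) * r(2, 3)),
    "r24 = r23*r34": (r(2, 4), r(2, 3) * r(3, 4)),
    "r14 = r12*r23*r34": (r(1, 4), r(1, 2) * r(2, 3) * r(3, 4)),
}
for name, (observed, predicted) in checks.items():
    print(name, round(observed, 3), round(predicted, 3))
```

When the data really come from a chain, the two columns agree up to sampling error. A large gap in real data is therefore evidence against the chain model.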

A second, simpler and more elegant approach is suggested in (7,
Chapter 3 and 51, Chapter 2). Since the equations of hierarchical
causal models are autonomous behavioral relations,
the “dependent” endogenous variable in each equation depends
solely on the independent variables in that same equation and
whatever variables these independent variables themselves depend
on. Therefore, we can legitimately estimate by least squares or
some other method the coefficients of any single equation in
hierarchical models with uncorrelated random terms without taking
the other equations explicitly into account. More precisely, the
partial slopes corresponding to missing a coefficients in the
coefficient pyramid should always be zero if we control for all
other variables of higher causal order. Variables of higher causal
order are those in higher rows of the coefficient pyramid.

Turning to the models in Figure 1, this rule means that three
partial slopes in Theory A, one partial slope in Theory B, and two
partial slopes in Theory C should be zero. (These logical deductions
from the causal model are given on the left below each of
the arrow diagrams in Figure 1.) When partial slopes equal zero,
corresponding partial correlations are also zero; therefore we can
transform these deductions into the simplified “predictions” concerning
observable correlations given in the figure. (Formulas for
higher order partial slopes and correlations are obtainable from
standard statistics textbooks.) These predictions can be verified to be
the same as those made by the first method.

Turning to the results of our analysis, the comparisons between
the values of the expressions on either side of the prediction equations,
we see first that one prediction of the “simple stages theory”
is exactly correct but that two others are way off. The Lerner
theory, on the other hand, has one result that is within 0.04 of
the observed value, a difference between correlations that could easily
be attributed to sampling (or measurement) error. 18 The “Lerner
Theory,” in which direct causal links between urbanization and
media development and between literacy and political participation
are added to the links of the simple stages theory, thus better
survives our limited empirical test.

The “Revised Lerner Theory” shown in Figure 1C, however, is
also plausible. Omitting the direct link between urbanization
and media development, it is a compromise between
the “simple stages theory” and the original “Lerner Theory.” Like
the latter of these theories, it too resists falsification. Whether or
not the revised theory is preferred to the original theory depends
on such additional criteria as parsimony and realism.
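The comparisons between predicted and observed correlations can be recomputed from the six observed correlations reported in Figure 1. The sketch below is only an arithmetic check under that assumption; the variable names are illustrative. The three-link product is computed with $r_{23}$ as the middle factor, as the prediction formula states, so it need not coincide with the result as printed in the figure.

```python
# Observed correlations as reported in Figure 1.
r12, r13, r14 = 0.70, 0.41, 0.42
r23, r24, r34 = 0.58, 0.66, 0.42

# Theory A (simple stages): correlations between non-adjacent stages
# are products of the correlations along the chain.
pred_A = {
    "r13": r12 * r23,
    "r24": r23 * r34,
    "r14": r12 * r23 * r34,
}
obs = {"r13": r13, "r24": r24, "r14": r14}
gaps_A = {k: abs(obs[k] - pred_A[k]) for k in pred_A}

# Theory B (Lerner): the single deduction b14.23 = 0 gives one prediction for r14.
pred_B_r14 = r12 * r24 + (r34 - r24 * r23) * (r13 - r12 * r23) / (1 - r23**2)
gap_B = abs(r14 - pred_B_r14)

for key in pred_A:
    print("A", key, round(pred_A[key], 3), "observed", obs[key])
print("B r14", round(pred_B_r14, 3), "observed", r14)
```

Theory A reproduces $r_{13}$ closely but misses $r_{24}$ by a wide margin, while the single Theory B prediction falls within about 0.04 of the observed $r_{14}$.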

Estimating causal links and causal paths. Considering the number
of simplifying assumptions and operational approximations involved in
attempting to test a longitudinal theory with cross-sectional data,
the successful survival of the Lerner Theory is gratifying enough
for us to proceed to estimate the magnitude of the links involved.
Especially interesting, once the a’s are known, would be some idea
of the relative strength of the causal paths linking urbanization and
political participation.

For the original “Lerner Theory” (Figure 1B), we shall calculate
both dependence and path coefficients. By “dependence
coefficients” we mean standardized a coefficients in causally interpreted
linear stochastic models; “path coefficients” are products

18 [...] of causal modelling are as yet not
thoroughly explored (see, however, [...] for interesting beginnings). For
recursive models it seems reasonable to expect that [significance tests] of the magnitude
of supposedly zero partial slopes would be feasible when the appropriate sampling, equal
variance and normality assumptions are satisfied.

19 [...] plausible three, four, and five
arrow theories from which predictions were made turned out to be incorrect.
Since several other less plausible models make the [same predictions as the Lerner]
theory, additional criteria for choosing [...].

of dependence coefficients along particular chains linking a causal
variable to one of its effects (see Wright’s extensive development
of these ideas in 53, 54, 55, and Boudon’s applications in 9, 10).

Dependence coefficients for the Lerner Theory were calculated
by separate least squares regression analysis of each equation in
system (3a). Table 1 gives these and related path coefficients. As
we might expect from the good predictions of the Revised Lerner
Theory, the urbanization → media link is weak; so is that between
media and participation. A more challenging (but debatable)
inference concerns the relative strength of two pathways from
urbanization to participation: urbanization → literacy → participation
and urbanization → literacy → media → participation.
Going causally from literacy to participation seems to characterize
modernization processes more than the more indirect route through
media development.

Since our data downgrade the causal role of the media in direct
contradiction to the emphasis in Lerner’s writings, a partial analysis
of other correlation data (using the same causal model) was
attempted to see whether or not measurement error in the media
index was responsible. The subsequent analysis indeed showed the
media → participation link to be stronger and the literacy → participation
one to be weaker.


20 Wright’s basic formula in the case of no correlated error terms is: “Any correlation
between variables in a network of sequential relations can be analyzed into contributions
from all the paths (direct or through common factors [causes]) by which
the two variables are connected, such that the value of each contribution is the
product of the coefficients pertaining to the elementary paths.” (53, p. 163). Applying
this rule to the arrow model of Figure 1B gives six equations, assuming a’s to be for
standardized variables:

\[
\begin{aligned}
r_{12} &= a_{21} \\
r_{13} &= a_{31} + a_{21}a_{32} \\
r_{14} &= a_{21}a_{42} + a_{21}a_{32}a_{43} + a_{31}a_{43} \\
r_{23} &= a_{32} + a_{21}a_{31} \\
r_{24} &= a_{42} + a_{32}a_{43} + a_{21}a_{43}a_{31} \\
r_{34} &= a_{43} + a_{32}a_{42} + a_{31}a_{21}a_{42}
\end{aligned}
\tag{3c}
\]

These equations can be manipulated to give the same prediction ($r_{14.23} = 0$) given by
the first and second derivation methods already discussed. They are in fact the [...]
(53, p. 163 f). Because this convention places dependent variables on the opposite side of
the equals sign from independent ones (this is not true for the earlier
derivation procedures), to agree with Equations (3a) the coefficients
in 3c should each individually be preceded by a minus sign.

Table 1. Least-Squares Estimates of Standardized Dependence
and Path Coefficients for the Lerner Theory of Modernization*
(N = 85)

Dependent Variable       Independent Variable     Dependence Coefficient ($a_{ij}$)
Literacy ($X_2$)         Urbanization ($X_1$)     $a_{21}$ = 0.70
Media ($X_3$)            Urbanization ($X_1$)     $a_{31}$ = 0.063
Media ($X_3$)            Literacy ($X_2$)         $a_{32}$ = 0.57
Participation ($X_4$)    Literacy ($X_2$)         $a_{42}$ = 0.64
Participation ($X_4$)    Media ($X_3$)            $a_{43}$ = 0.051

Dependent Variable       Causal Path                         Path Coefficient
Participation ($X_4$)    $X_1$ → $X_2$ → $X_4$               $p_{124} = a_{21}a_{42}$ = 0.45
Participation ($X_4$)    $X_1$ → $X_2$ → $X_3$ → $X_4$       $p_{1234} = a_{21}a_{32}a_{43}$ = 0.02
Participation ($X_4$)    $X_1$ → $X_3$ → $X_4$               $p_{134} = a_{31}a_{43}$ = 0.003

* Estimates are derived by the least squares method applied to the model (3 a,b) 
and data of Figure IB. They are equivalent to estimates derivable from the path 
coefficient approach and Equations (3 c) using the observed correlational values 
r i2 r i 3 — r i 4 = -42, r 23 = .58, r 24 = .67, r 34 = .42. The path coeffi¬ 

cients in the table add to 0.47, indicating an error of 0.05 from the true value of their 
sum (r 14 = .42) if the model were exactly correct. It should also be noted that, un¬ 
like the convention of equations la, 2a, and 3a, positive dependence coefficients refer to 
positive dependence relationships. 
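
The arithmetic in the note to Table 1 can be retraced in a few lines. This is only a sketch: the variable names are illustrative assumptions, while the coefficient values and the observed correlation of .42 are taken from the table and its note.

```python
# Dependence coefficients as reported in Table 1.
a21, a31, a32, a42, a43 = 0.70, 0.063, 0.57, 0.64, 0.051

# Each path coefficient is the product of the links along one causal path
# from urbanization (X1) to participation (X4).
paths = {
    "X1->X2->X4": a21 * a42,
    "X1->X2->X3->X4": a21 * a32 * a43,
    "X1->X3->X4": a31 * a43,
}

implied_r14 = sum(paths.values())   # correlation implied by the model
observed_r14 = 0.42                 # observed value quoted in the table note
discrepancy = implied_r14 - observed_r14

for name, value in paths.items():
    print(f"{name}: {value:.3f}")
print(f"sum of paths = {implied_r14:.2f}, discrepancy = {discrepancy:.2f}")
```

The three products round to the path coefficients in the table, and their sum exceeds the observed correlation by about 0.05, as the note states.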

III. Reciprocal Causal Relationships 
In testing the model and estimating the parameters in the
Lerner Theory, we had to remove any direct reciprocal
links between two variables in order to get determinative results.
Such links are a special case of non-hierarchical circular or feedback
influence relationships. To illustrate a theory-building procedure
not restricted to the assumptions of uncorrelated random
terms and hierarchical relationships, but including exogenous
variables as influences on the endogenous ones, we shall consider
another set of propositions regarding national systems. 21

Our method of theorizing will, however, be different from the
Lerner example in that interrelated propositions will be derived
from the interaction between traditional conceptions of political
alternatives and more empirically based data analysis. It is hoped

shall still have to assume that our findings indicate something like longitudinal 
(historical) causal relationships. All our equations may still be interpreted as probabi¬ 
listic behavioral generalizations (or laws), but they will no longer be assumed to 
indicate autonomous relationships. Because of the high level of aggregation in dealing 
with national systems, it will also be even more difficult than it was in the previous 
examples to identify specific actors, or within-nation mechanisms responsible for each 
of the causal links involved. Nonetheless we shall sometimes refer to people in either 
illustration in such terms as “media consumers and producers,” "voters,” etc., when 
these terms seem appropriate. 
that this one small example will illustrate how many of the insights
of the rich qualitative tradition can be partially translated into 
more mundane, but more precise and testable, theories about 
crucial reciprocal political relationships. 22 

In particular, we shall attempt to state, test, and estimate para¬ 
meters for several theories about reciprocal relationships between 
communism and democracy in economically developed Western 
societies. 23 Other mutually dependent variables will include levels 
of executive stability, political participation, and domestic group 
violence; each will be assumed exogenously and partially to depend 
on levels of urbanization, literacy, and economic development and 
other residual factors. 

Sources for the indicators to be used, and correlations among
them, are given in Table 2. Endogenous, political indicators include:
1) Communist voting as a percentage of national totals
(Yale Political Data Program, etc.); 2) polyarchy (approximately
as coded by Arthur Banks on the basis of the existence of a
legitimate opposition, free press, elections, etc.); 3) domestic
group violence (logged deaths as a fraction of population size,
according to Rudolph Rummel); 4) political participation (as
described in the notes to Figure 1, primarily an index based on
voting turnout); and 5) average executive stability or tenure
(approximately as coded by the Yale Political Data Program).
Literacy rates, urbanization, and per capita Gross National Product
are the exogenous variables. Symbolic labels for the endogenous
($X_1, \ldots, X_5$) and the exogenous variables ($Z_1, Z_2, Z_3$) are
given in the margins of the table.


22 The possibility of applying such models to "exchange” and "feedback” relation¬ 
ships such as consumer-producer exchanges has already been mentioned (see 17, 18, 19, 
45, 52). Similarly, Lerner presents a "circular" arrow model for the relationships
among interest articulation, interest aggregation, and public communication, etc. (38,
p. 348 ff) based on work of Almond and Coleman; he also discusses changes in the
"vicious circle" of poverty necessary to bring about a "growth cycle" (38, p. 346 ff).
Maruyama has reinterpreted Myrdal’s work on the growing gap between rich and poor 
nations (e.g., 43), in terms of deviation amplifying reciprocal causation (41). Literary 
commentators on politics also often stress cyclical relations: Sartre, for example, has 
emphasized how the oppression of the colons and the hatred of the colonial reinforce 
each other (see references in 20). 

23 Several of these clearly do not apply to Soviet Bloc countries or non-communist 
underdeveloped ones. Thus these theories will be more modest in their generality than 
those which Lerner claimed were valid "regardless of variations in race, color, creed” 
(37, p. 46). That some political (perhaps more than socioeconomic) "laws” differ in 
different contexts has been amply demonstrated. Ways of formulating (usually non¬ 
additive) causal theories that encompass such varieties have been discussed in (1, 14, 
15, and the references cited therein). 
Table 2: Correlations Among 3 Socioeconomic and 5 Political
Characteristics of 36 Economically Advanced
Non-Communist Nations*

Communist Vote (X_1)              1.00   .217  -.087  [...]
Polyarchy (X_2)                          1.00   .639  [...]
Political Participation (X_3)                   1.00  -.53   [...]
Domestic Group Violence (X_4)                          1.00  [...]
Executive Stability (X_5)         [...]
Literacy (Z_1)                    [...]
Urbanization (Z_2)                [...]
Per Capita GNP (Z_3)              [...]

* [...] products above $250.00. Socioeconomic indicators are taken from [...]
Deutsch, H. Lasswell, World Handbook of Political and Social Indicators (Yale
University Press, New Haven: [...]). Political ones are factor indices derived in
reference 3 and discussed in the text of the present paper.

Mathematizing reciprocal causal relationships. A circular arrow
diagram and an equivalent non-pyramidal [...]
[...] Communist voting ($X_1$) and polyarchy ($X_2$) are anti[...] to
each other. While domestic group violence ($X_4$) [...] of
the frustration it represents or causes [...], polyarchic systems
are assumed to encourage legitimate [...] and to dis-
courage or obviate [...]. [...] participation
(e.g., voting) is thought to [...] the need for [...]; communist
voting is modelled as decreasing the chances of democratic
government (polyarchy).

In a sense, these interrelated propositions follow from the [...]
arguments of traditional apologists for both democracy and communism. 24
Communism is supposed [...] of domestic revolution
and violence and, in the long run, to [...] democracy. Polyarchy,
on the other hand, is said by its modern advocates to increase

24 [...] Correlation matrices like [...] and polyarchy decreasing violence
and increasing participation. This didn't work, [...] a link equivalent to the
"communism will bury polyarchy" argument [...] reciprocal models.
I was unsuccessful in several attempts meaningfully to include executive stability and
still get a set of reasonable parameter estimates.
popular participation in government and to decrease the likelihood
of domestic violence.

Figure 2. A Reciprocal Causal [...] Relationships Between
Communist Voting and Democratic Government in Economically
Advanced Non-Communist Nations.

A. Arrow Diagram*

[Arrow diagram not reproduced. Variables shown: per capita GNP ($Z_3$),
polyarchy, Communist vote, participation, group violence, urbanization, literacy.]

B. Linear Model

Endogenous Political Units                Exogenous

$X_1 + a_{14} X_4 + b_{12} Z_2 + b_{13} Z_3 = U_1$

$a_{21} X_1 + X_2 + b_{21} Z_1 + b_{23} Z_3 = U_2$            (4a)

$a_{32} X_2 + X_3 + b_{31} Z_1 + b_{32} Z_2 = U_3$

$a_{42} X_2 + a_{43} X_3 + X_4 + b_{43} Z_3 = U_4$

* Continuous signed arrows refer to [...]; [...] arrows indicate
[...] sign [...].

Taken together these two [...]
unstable circular relationships [...] participation and
[...] voting, one via domestic violence, the other [...]
resulting changes in violence and [...]. That an unstable
equilibrium is implied by these [...] propositions is
shown by assuming a random [...] in com[...]
[...] that chances [...]
the extent of the nation’s democracy. An initial spurt for polyarchy,
on the other hand, would probably end in a world of democracies 
if these propositions were correct! 


Table 3: Estimated Dependence Coefficients for Original
and Revised Equilibrium Theories of Communist
Voting - Democratic Government Relationships*

                              Original Theory             Revised Theory
Variables                     (Figure 2.A)                (Figure 3.A)
Effect        Cause           Estimate†         Theory    Estimate†         Theory

comm. vote    violence        a_{14} =  1.30    ✗         a_{14} = -.49     ✓
polyarchy     comm. vote      a_{21} = -.40     ✗         a_{21} = -.40     ✓
particip'n.   polyarchy       a_{32} = -.45     ✓         a_{32} = -.20     ✓
violence      polyarchy       a_{42} =  .17     ✓         a_{42} =  .24     ✓
violence      particip'n.     a_{43} =  .46     ✓         a_{43} =  .39     ✓
comm. vote    literacy                                    b_{11} = -.57     ✓
comm. vote    urbanz'n.       b_{12} =  .27     ✗         b_{12} =  .34     ✓
comm. vote    income          b_{13} =  .44     ✓
polyarchy     literacy        b_{21} = -.04     ✓         b_{21} = -.04     ✓
polyarchy     income          b_{23} = -.57     ✓         b_{23} = -.57     ✓
particip'n.   literacy        b_{31} = -.53     ✓         b_{31} = -.58     ✓
particip'n.   urbanz'n.       b_{32} = -.02     ✓
particip'n.   income                                      b_{33} = -.13     ✓
violence      income          b_{43} =  .05     ✓         b_{43} =  .04     ✓

* Estimates were obtained by least-squares regression methods from the reduced
forms of Equations (4a: Figure 2.A) and (5a: Figure 3.A). The standardized regression
coefficients in each case were the same. For dependent variables 1, 2, 3, and 4 (in that
order) and exogenous variables 1, 2, 3 (in that order) they were: .42, -.32, -.12;
.20, [...], [...]; [...], [...], .24; -.32, .04, -.25. A mimeographed sheet describing
the algebraic derivations involved is available from the author on request.

† If the relevant link in the arrow diagrams of Figure 2 and Figure 3 has a plus
sign this means the related a or b coefficient should be minus, and vice versa, because
all the X's and Z's are on the same side of the equals sign.


Exogenous to these essentially political dependencies in Figure
2 are a number of socioeconomic links with urbanism, literacy, and
per capita GNP. Since this theory does not attempt to explain the
causes of these variables themselves, we were able to draw on the
earlier hierarchical modelling experience of this paper in choosing
specific exogenous relationships. Use was also made of the simple
correlations in Table 1. Specifically, literacy was assumed to increase
both polyarchy and popular political participation, urbanization
(à la Lenin) to increase communist voting and political
participation, and high per capita income to increase the chances
of democracy, while decreasing the appeal of both violence and
communist voting. This last trio of democratic attributes follows
rather naturally, of course, from the work of Lipset and others on
socioeconomic conditions for democracy.

After mathematically specifying a complex model of political 
and socioeconomic interrelationships, there remained the problem 
of assuring oneself of the identifiability of the parameters implied 
by the model. If from the empirical distributions of both exogen¬ 
ous and endogenous variables, a coefficient estimating procedure 
will always lead to a unique set of estimates, a model’s equations 
are exactly identifiable. 25 The alternative situations are that a 
model is overidentified (several values of the a's are predicted,
or additional relations among correlations can be derived) or 
underidentified (in which case the model is indeterminate and 
we do not have enough information to obtain less than an infini¬ 
tude of coefficient estimates). Fortunately, the variety of exogen¬ 
ous links used in the equations of the reciprocal causal models in 
Figures 2 and 3 is sufficient for each of these equations to be 
identified. 26 

Identifiability is thus a theoretical property that may hold inde¬ 
pendently of the data used to estimate, validate, or falsify causal 
models. Stating or revising one’s causal models so that each equa¬ 
tion will be identifiable is obviously an important theory building 
problem. Basically, for any equation, other equations in the model

25 This problem pertains to both hierarchical models (in which no uncorrelated 
residuals assumptions are made) and reciprocal ones. In general recursive models are 
identifiable only if uncorrelated residuals are assumed (see 10). If we had added a 
direct causal arrow between urbanism and participation in the first version of the 
Lerner Theory (Figure IB), no excess predictions could then have been derived from 
the model, although equations (3c) would still have given determinate results for the 
a’s. Adding one more causal link (and a coefficient) would have simultaneously made 
the Lerner Theory non-recursive and non-identifiable. Six equations in seven unknowns 
(the old a’s plus two new ones) would have allowed an infinite number of solutions 
for the values of these a’s. A good introduction to the identifiability problem may be 
obtained by reading (Hood and Koopmans, 29, Chapter 2) and then (Valavanis, 50,
Chapter [...]).

26 Actually, we tested one model not presented here, with fewer b coefficients than
those in Figures 2 or 3, that was "overidentified." It proved unsatisfactory.

Koopmans has stated that: "A necessary condition for the identifiability of a
[behavioral] equation within a given linear model is that the number of variables
excluded from the equation (more generally, the number of linear restrictions on the
parameters of that equation) be at least equal to the number of [behavioral] equations
less one. ... A necessary and sufficient condition for the identifiability of a [behavioral]
equation within a linear model, restricted only by the exclusion of certain variables
from certain equations, is that we can form at least one nonvanishing determinant of
order G-1 out of these coefficients, properly arranged, with which the variables excluded
from that . . . equation appear in the G-1 other . . . equations." (29, p. 38) Tintner
(48, Chapter 7) gives a simple illustration of the application of these conditions.
have to be different enough from it so as not to be confused with it.
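
The order and rank conditions quoted from Koopmans can be checked mechanically for the four equations of the Figure 2 model. The sketch below is an illustration under stated assumptions: the coefficient matrix is filled with the Original Theory estimates of Table 3 only so that ranks can be computed numerically, and the function and variable names are hypothetical.

```python
import numpy as np

# One row per equation of (4a); columns are X1, X2, X3, X4, Z1, Z2, Z3.
# A zero entry means the variable is excluded from that equation.
C = np.array([
    [1.00,  0.00, 0.00, 1.30,  0.00,  0.27,  0.44],   # communist vote
    [-0.40, 1.00, 0.00, 0.00, -0.04,  0.00, -0.57],   # polyarchy
    [0.00, -0.45, 1.00, 0.00, -0.53, -0.02,  0.00],   # participation
    [0.00,  0.17, 0.46, 1.00,  0.00,  0.00,  0.05],   # violence
])

def check_identification(C):
    """Return (number excluded, order condition, rank condition) per equation."""
    G = C.shape[0]                      # number of behavioral equations
    results = []
    for g in range(G):
        excluded = np.flatnonzero(C[g] == 0)
        others = [h for h in range(G) if h != g]
        # Order condition: at least G-1 variables excluded from equation g.
        order_ok = len(excluded) >= G - 1
        # Rank condition: coefficients of the excluded variables in the
        # other G-1 equations contain a nonvanishing determinant of order G-1.
        sub = C[np.ix_(others, excluded)]
        rank_ok = bool(np.linalg.matrix_rank(sub) == G - 1)
        results.append((len(excluded), order_ok, rank_ok))
    return results

results = check_identification(C)
for g, (n_excl, order_ok, rank_ok) in enumerate(results, start=1):
    print(f"equation {g}: excluded={n_excl}, order={order_ok}, rank={rank_ok}")
```

Each equation omits exactly three of the seven variables, which is the minimum of G-1 = 3 required by the order condition; under these assumed values the rank condition also holds for every equation.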

Estimating and testing reciprocal models. Because reciprocal 
models in general cannot be used to generate predicted relation¬ 
ships among observable correlation coefficients (49, Chapter 6), 
some other method is necessary for choosing among reciprocal 
causal models. The approach to be used here suggests eliminating 
or failing to eliminate reciprocal causal models primarily on the 
basis of correct or correctable theoretical predictions of the signs 
of estimated coefficients. Notice how in one sense we are prag¬ 
matically using a theory in order to test it; this approach has little 
payoff, of course, unless some a priori degree of belief can be gen¬ 
erated concerning the direction of particular causal relationships. 

Estimates of the a's and b's in Figure 2 could be obtained by
a number of econometric methods, including maximum likelihood
analysis. The one used here is generally referred to as a “sophisticated
least-squares procedure.” 27 It assumes that the equations of
a causal model, e.g. Figure 2, implicitly rather than explicitly indicate
behavioral relationships. To get at the implicit coefficient
values requires taking the simultaneous circular causal effects of
dependent variables on themselves into account. Therefore, all
endogenous variables in a set of causal equations have simultaneously
to be solved for in terms of the exogenous variables,
from which unbiased coefficient estimates may be obtained by
least-squares procedures. Going back from “reduced form” least-squares
estimates (obtained from equations relating each X to only
exogenous variables and random terms) to the original coefficients
is, in fact, one of the prime reasons why the identifiability
question is raised: can we derive unique values of the a's and b's
from least-squares estimates based only on the reduced form
equations? For identifiable equations, the answer is yes.
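
The two-step logic described above (least squares on the reduced form, then a return to the structural coefficients) can be sketched on synthetic data. This is an illustration only, not the original computation: the data are simulated, the sample size, random seed, and residual scale are arbitrary assumptions, and the "true" coefficients are simply the Original Theory estimates of Table 3.

```python
import numpy as np

# Structural form A X + B Z = U (rows: equations for X1..X4),
# filled with the Original Theory estimates of Table 3.
A = np.array([
    [1.00,  0.00, 0.00, 1.30],
    [-0.40, 1.00, 0.00, 0.00],
    [0.00, -0.45, 1.00, 0.00],
    [0.00,  0.17, 0.46, 1.00],
])
B = np.array([
    [0.00,  0.27,  0.44],
    [-0.04, 0.00, -0.57],
    [-0.53, -0.02, 0.00],
    [0.00,  0.00,  0.05],
])

rng = np.random.default_rng(0)          # fixed seed (assumption)
n = 200_000                             # synthetic sample size (assumption)
Z = rng.standard_normal((n, 3))         # exogenous variables
U = 0.5 * rng.standard_normal((n, 4))   # residuals, independent of Z

# Generate the endogenous variables by solving the structural equations.
X = np.linalg.solve(A, U.T - B @ Z.T).T

# Step 1: ordinary least squares of every X on the exogenous Z's
# gives the reduced-form coefficients (one row per endogenous variable).
Pi_hat = np.linalg.lstsq(Z, X, rcond=None)[0].T

# Step 2: return to equation 1, X1 + a14*X4 + b12*Z2 + b13*Z3 = U1.
# Z1 is excluded from it, so the Z1 column of the reduced form fixes a14;
# the other two columns then give b12 and b13.
a14_hat = -Pi_hat[0, 0] / Pi_hat[3, 0]
b12_hat = -(Pi_hat[0, 1] + a14_hat * Pi_hat[3, 1])
b13_hat = -(Pi_hat[0, 2] + a14_hat * Pi_hat[3, 2])

print(f"a14 = {a14_hat:.2f}, b12 = {b12_hat:.2f}, b13 = {b13_hat:.2f}")
```

With a large synthetic sample the recovered values should lie close to the coefficients used to generate the data, which is what exact identification of the first equation promises.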

Turning now to Table 3 and Figure 2, we see that three coefficients
($a_{14}$, $a_{21}$, $b_{12}$) have had their signs incorrectly predicted.
Looking at Figure 2, and changing in our mind the circled (in- 

27 The standardized coefficients were estimated by least squares analysis of the
reduced form of the causal model of Equations (4a). This method allows for the
interdependence of the X's, solving for them in terms of exogenous variables
and residual terms. This “sophisticated least squares procedure” gets around the
principal objections raised by maximum likelihood advocates. (See 50, 51 regarding this
and other estimating procedures.) The present results were obtained by tedious algebraic
derivations of the reduced forms of Equations (4a) and (5a), then computerized least
squares analyses, and finally the calculation of the a's and b's in the unreduced equations
using these results. A more elegant approach to getting the X's in reduced form is
given by Equation (6), obtainable from Equation (1b) above:

$X = -A^{-1} B Z + A^{-1} U$                    (6)
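
Equation (6) can be evaluated directly once A and B are filled in. The sketch below assumes the Original Theory estimates of Table 3 as entries; the array layout and names are illustrative choices, not the author's notation beyond A, B, and the reduced-form matrix.

```python
import numpy as np

# Structural matrices of A X + B Z = U, using the Original Theory
# estimates of Table 3 (rows: equations for X1..X4).
A = np.array([
    [1.00,  0.00, 0.00, 1.30],
    [-0.40, 1.00, 0.00, 0.00],
    [0.00, -0.45, 1.00, 0.00],
    [0.00,  0.17, 0.46, 1.00],
])
B = np.array([
    [0.00,  0.27,  0.44],
    [-0.04, 0.00, -0.57],
    [-0.53, -0.02, 0.00],
    [0.00,  0.00,  0.05],
])

# Systematic part of Equation (6): X = Pi Z + A^{-1} U with Pi = -A^{-1} B.
Pi = -np.linalg.inv(A) @ B

# Row i holds the reduced-form coefficients of X_i on Z1, Z2, Z3.
print(np.round(Pi, 2))
```

The first and fourth rows agree, to within rounding, with the legible reduced-form regression coefficients listed in the note to Table 3 (.42, -.32, -.12 and -.32, .04, -.25).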
correct) arrow signs, we see that our model signs now show a circular
reinforcement of polyarchy by communist voting, via less
domestic violence and more communist votes or via greater participation,
less violence and again more communist votes. Unstable
equilibrium is implied.
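
The direction of each feedback loop can be read from the product of the direct effects around it. The sketch below is a heuristic illustration, not the author's procedure: it assumes that the direct effect of one variable on another is the negative of the corresponding a coefficient (since all terms sit on the same side of the equals sign), it uses the estimates of Table 3, and it looks only at the sign of each loop product.

```python
def loop_products(a14, a21, a32, a42, a43):
    """Products of direct effects around the two loops through communist voting.

    With all X's on the same side of the equals sign, the direct effect
    of X_j on X_i is -a_ij.
    """
    e14, e21, e32, e42, e43 = -a14, -a21, -a32, -a42, -a43
    # Loop 1: comm. vote -> polyarchy -> violence -> comm. vote
    short_loop = e21 * e42 * e14
    # Loop 2: comm. vote -> polyarchy -> participation -> violence -> comm. vote
    long_loop = e21 * e32 * e43 * e14
    return short_loop, long_loop

# Estimates from Table 3.
original = loop_products(a14=1.30, a21=-0.40, a32=-0.45, a42=0.17, a43=0.46)
revised = loop_products(a14=-0.49, a21=-0.40, a32=-0.20, a42=0.24, a43=0.39)

for label, loops in (("original", original), ("revised", revised)):
    kinds = ["reinforcing" if g > 0 else "counteracting" for g in loops]
    print(label, [round(g, 3) for g in loops], kinds)
```

A positive product means that a disturbance returns to its origin with the same sign (deviation-amplifying feedback); a negative product means it returns with the opposite sign. On this reading both loops are reinforcing under the Original Theory estimates and counteracting under the Revised Theory estimates.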

A careful reanalysis was necessary to see why the sophisticated
least squares procedure produced estimates violating Figure 2's
theoretical predictions. Trying, if possible, to keep the same number
of missing exogenous variables in each of the model’s equations
(in order to retain the uniquely identifiable characteristic of
the model), it was first decided to assume a positive impact of literacy
on communist voting, in line with the similar effect already
assumed for total voting levels in Figure 1. From the reduced form
regression analysis, income seemed a less promising agent of decreased
communist voting (decreasing concentration of wealth
would have been better), so it was removed from this link and
joined to public political participation. A weak link (urbanization
-> participation) was also dropped.

A close look at some particular cases (residuals analysis) helped
suggest why two of the three wrong predictions had been incorrect
(perhaps the third was due to improper specification of the exogenous
relationships). First of all, $b_{12}$ is positive because the highest
communist voting levels have occurred in countries like Finland,
Italy, France, and Chile, none of which is terribly urbanized.
In Western Europe, communism appears to be more characteristically
rural than urban, as at least one leading Chinese theoretician
would like us to believe. Similarly, $a_{14}$ is positive because
these countries have average or above average polyarchy scores.
Communist voting in Western Europe indeed occurs most significantly
in countries tolerating radical opposition. 28 The existence
of circa 15%-20% communist voting levels helps to maintain (or is
“functional for”) traditions and institutions tolerating dissent. Perhaps
communist voters can legitimately relieve or outgrow the
frustrations in this relatively harmless way. Apparently, communist
voting at these levels increases rather than diminishes system
democracy.

Testing of a revised “peaceful co-existence” reciprocal causal 
model (Figure 3) devised in light of this reanalysis gives more 
plausible size estimates for the a and b coefficients, as well as cor¬ 
rect signs for each of the twelve propositions summarized in Table 
3 under the Revised Theory heading. 
Figure 3. A Revised Theory of Reciprocal Equilibrium Relationships
Between Communist Voting and Democratic Government in
Economically Advanced Non-Communist Nations.

A. Arrow Diagram*

[Arrow diagram not reproduced.]
dencies in political processes (see 15, 19, 54) that are more consistent
with our original assumptions about the longitudinal causal
interpretability of cross-sectional analyses. Introducing time lag
relationships among these variables would then allow growth,
equilibrium, or decay in such adjustment processes to occur.

IV. Causation and Freedom 

Determinism attracts the social scientist for a number of reasons.
Causal agents are seen as unmoved movers, while causal laws
order chaotic experience. Thus causal explanations go below surface
relationships to determinative realities. But the use of partly
deterministic causal models does not imply the absence of free
choice even within the deterministic parts of these theories. The
real problem is the use of more or less coercive or [...].
“The distinction between free choice and behavior that is compelled
is drawn within the domain of causation. . . . A free choice
is not uncaused, but one whose causes include in significant measure
the aspirations and knowledge of the actor who is choosing”
(32, p. 121). There are no apparent a priori reasons why
choices freely or rationally made should fail to [...]
lawful regularities; the same possibilities should also exist for coerced
or irrational choice. These regularities should not however
be confused with the logical necessity of tautological mathematical
relationships. 29

Despite such arguments as these, mathematical models of political
relationships for many political scientists continue to connote
the restriction of freedom rather than the satisfaction of
curiosity or opportunities for political development. It may therefore
be of value briefly to discuss the kinds of freedom assumed or
implied in the mathematical models and causal theories discussed
in previous sections of this paper.

Political Choices. Unlike many of the recent quantitative studies
of national political systems, the reciprocal theories of Part III
above have stressed causal interdependence among political variables. 30
Levels of domestic group violence and political participation
were seen both to depend on and to influence collective de-

29 Detailed discussions of the causation-determinism issue as it applies to human
behavior may be found in references 22, 3[...].

30 [...] strongly put in Samuel [...] (30).
cisions concerning the desired forms of political institutions. Each
of these variables can and should be thought of as representing
observed regularities in partly autonomous collective political
choices, more or less freely arrived at.

Even when a more realistic assessment of the importance of
these mutually dependent links was made by taking into account
several important exogenous variables, political decisions (measured
by the a coefficients) were seen to be at least as determinative
of political outcome as socioeconomic causation (measured
by the b's). 31

Residual or random causes. In both the hierarchical and the reciprocal
theories examined above, each dependent variable was
assumed to be only probabilistically determined. The residual
terms (symbolized by the U's) indicated the lack of generality of
each explanatory equation. If, as appears to be the case, the U
terms account for roughly between 0 and 50 percent of the resultant
political behavior, 32 here too are important indications of self-generated
or even chance behavior. Except when domestic group
violence is involved, these “random” phenomena need not be coercive
ones.

Variable dependence coefficients. Besides the political choices
and random terms, another aspect of freedom or indeterminacy in
the above mathematical theories is the dependence coefficients
themselves. They were not predicted on an a priori basis; rather
they were estimated from a particular set of data. As originally
specified, the arrow models indicated the existence and possibly
the direction of selected causal links. The sizes of the related dependence
coefficients were estimated only after the relevant data on
independent and dependent variables were collected. Because
these findings are only descriptive of a particular set of data (and
certainly not a random sample at that), other dependence coeffi-

[...] smaller estimate of the influence of those personal qualities and beliefs.” ([...], p. [...])

32 These rough estimates were derived from Figure 3 and Table 4. [...]
multiple correlation coefficients ($R^2$) for the model's reduced form [...] indicate
that approximately 15%, 40%, 25%, and 60% of the variation of their respective
dependent variables was "predetermined” by socioeconomic variables. If the remaining
[...] attributed equally to political variables and also to random causes,
the above 0-50% variance estimates are approximately correct.
cients consistent with the theories described are quite possible for
different sets of data and different time periods. Granted that the 
choice of the degree of open competitiveness in national political 
systems is partly determined by a variety of other political and 
socioeconomic factors, changes in the relevant mix of these causal 
factors represent significant differences between various kinds of 
political systems. 33 

This means that such models can be used to define and measure
structural changes associated with evolution or decay of political
systems. Changes in the very “laws” governing social and
political relationships are, in fact, some of the most frequent topics 
of concern in the classical literature of politics. Causal models may 
even be used to explain why such gradual or revolutionary breaks 
with the past have occurred! 

A multiplicity of empirically acceptable causal models. As our
analysis of the original and revised Lerner models suggested,
more than one causal model may be consistent with a particular 
set of observed correlations. This may be true whatever the gen¬ 
erality of the correlations concerned. Specifically, our procedure 
of rejecting or failing to reject particular causal theories never 
succeeded in eliminating all but one possible theory. At best sev¬ 
eral plausible theories were partly discounted. 

Mathematically, it is easy to show that several causal theories
make the same empirical predictions (the developmental sequence
X -> Y -> Z and the double causal situation X <- Y -> Z, for
example, both imply that $r_{xz.y} = 0$). Since adding or subtracting
variables and links from a model may or may not change the num¬ 
ber or nature of predictions involved, both mathematical and 
methodological injunctions to be tentative in advocating the truth 
of one particular model coincide. 
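
The equivalence of the two three-variable models can be shown in a few lines. This is a minimal sketch under the usual assumptions of standardized variables and uncorrelated residuals; the function names and the coefficient values 0.6 and 0.5 are illustrative only.

```python
import math

def partial_r(r_xz, r_xy, r_yz):
    """First-order partial correlation of X and Z controlling for Y."""
    return (r_xz - r_xy * r_yz) / math.sqrt((1 - r_xy**2) * (1 - r_yz**2))

def chain(a_yx, a_zy):
    """Developmental sequence X -> Y -> Z: returns (r_xy, r_yz, r_xz)."""
    return a_yx, a_zy, a_yx * a_zy

def fork(a_xy, a_zy):
    """Double causal situation X <- Y -> Z: returns (r_xy, r_yz, r_xz)."""
    return a_xy, a_zy, a_xy * a_zy

r_chain = chain(0.6, 0.5)
r_fork = fork(0.6, 0.5)

p_chain = partial_r(r_chain[2], r_chain[0], r_chain[1])
p_fork = partial_r(r_fork[2], r_fork[0], r_fork[1])
print(r_chain == r_fork, p_chain, p_fork)
```

Both models imply the same three correlations, so no analysis of the correlations alone can separate them; the choice between them has to rest on theory or on additional variables.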

Theoretically, these possibilities allow for causal situations in 
which different but indistinguishable causal models are at work. 


33 The fact that the polyarchy-communism model gives several coefficient estimates 
with different signs when applied to economically underdeveloped nations is a graphic 
illustration of the more or less autonomous changes that developed nations have made 
in the determinants of the political characteristics, even if we assume the same arrow
model (without signs) to apply.

In a similar vein, Talcott Parsons has frequently argued that an impressive achievement
of most industrialized Western democracies has been the depolarization of the
lower class-radicalism voting relationship. "Status polarization" of the 1956 American
presidential election, for example, was almost completely avoided. These findings do
not imply, of course, that other determinants of voting behavior did not exist. See
the Parsons reference cited in 19 and the data in 11.
Even within one causal model, there may be several pathways of
change. Nonrecursive models of political competitiveness may in
fact summarize and cover up two or more recursive explanatory
models, each chosen by a number of national political systems.

Determinism and freedom. It should be clear that the incomplete
specification of the causal models presented in this paper
allows for a variety of choices and indeterminacies in the reality
being described. In addition, the possibility has been suggested
that political variables may themselves represent free and responsible
collective political choices, some of whose partially determined
consequences we have tried to explore. Philosophically,
this perspective corresponds to the humanistic view that social
reality is only partly predetermined. Mathematically, the procedures
investigated have helped make explicit the variety and
extent of the constraints and opportunities involved.

V. Summary and Conclusions

Within the variety of possible explanations for political events,
social causation focuses on generalizations with determinative
significance. Causal statements are also usually asymmetric in
character, pay attention to pathways of influence between causes
and effects, and tentatively assume other possible causes are being
safely ignored or controlled. All of these aspects of causal explanation
apply to recent social science attempts to abstract and
generalize relatively precise and comprehensive arguments using
linear stochastic systems. It need not be assumed that these generalizations
apply independently of the historical context from
which they are drawn.

Within the causal modelling tradition there are again a variety
of procedures for theoretically coping with segments of political
reality. Many of them bear directly on philosophical arguments regarding
the nature of social reality. Models may be dynamic or
static, stochastic or deterministic, dealing either with what we
have called hierarchical or reciprocal influence relations. Major
alternatives also exist as to the extent of isolation we assume,
ceteris paribus, regarding the systems of relationships being described.
The price for logically insuring identifiable outcomes from
a system of equations, whatever their degree of implied isolation,
involves some additional specifications regarding the size or absence
of some of the possible causal links, including possible links
to exogenous variables whose causes themselves are not fully
assumed. We choose among such alternatives for a number of
theoretical and personal reasons, but their scientific survival depends
on resistance to empirical falsification.

Within the causal modelling approach to political analysis there 
are a number of ways of accounting for human decisions and re¬ 
sponsibilities. Whether freely or coercively arrived at, individual 
and collective choices can be considered as themselves causally 
responsible for other political consequences. Probabilistic models 
confess from the start that specific outcomes cannot be exactly 
predicted, even if certain tendencies are known to occur. Statistical 
models with unspecified but partly restricted coefficients reduce 
but do not eliminate the degrees of freedom associated with causal 
explanation. These models allow us to study both the environ¬ 
mental limitations and the deterministic consequences of political 
decision-making. Moreover, they provide parameters with which 
to measure historically varying structural forms of political activity. 

This paper has only briefly illustrated several ways of combining 
increasingly available political data collections, partly inductive 
causal inference techniques, and deductive, testable theories de¬ 
rived from a rich tradition of qualitative political analysis. 34 Even 
though they have not explicitly introduced the time dimension, 
our test cases have implied several quite distinctive possibilities 
about the ongoing nature of the political process. The “competitive 
coexistence” model, as applied only to non-communist states,
contained reequilibrating tendencies, which could also be labelled
“negative feedback.” The “communism vs. democracy” theory, on 
the other hand, was modelled as a case of disequilibrium, de¬ 
stabilizing change accomplished by positive feedback relation¬ 
ships. The hierarchical model of political development stood some¬ 
where between these two reciprocal systems as a case of unidirec¬ 
tional change (assuming its coefficients remain positive) without 


34 That the inductive and deductive interaction of concepts and data using these 
methods can go beyond merely obvious relationships is indicated by the differences 
between the magnitudes of the simple correlations in Table 1 and the causal links in 
Table 3. Straightforward factor analysis of Table 1 would not have discovered the
causal configuration of Figure 2. 

There are a number of inductive data analysis procedures, however, which are be¬ 
ginning to suggest possible causal inferences. If simple additive causal theories are 
appropriate, for example, factor analysis may detect them. If, on the other hand,
developmental sequences exist like those in Figure 1.A, one should apply other techniques.
Robert Abelson has informally suggested comparing structural similarities between
Guttman’s theory of the simplex and the equivalent pyramidal structure of recursive
causal systems. At least for simple learning problems, simple developmental sequences
and nearly perfect simplices seem to exist that are causally interpretable. See 25, 26,
27, 31.

major feedback relationships. Undoubtedly the world of politics
(of power, influence, and authority relationships) includes all of
these possibilities.

Perhaps the strongest argument in favor of causal models like
those we have discussed is that they help answer the central “who 
gets what, when, and why” questions of political analysis. Leaving 
more or less implicit many of the persuasion or reasoning processes 
accounting for the magnitudes of certain dependence coefficients, 
causal models can nonetheless help explain what maintains or 
changes the distribution of political power, social respect, and 
mental and physical health within particular societies. Interna¬ 
tionally, for example, the tentative analyses of the present paper 
have suggested that most European nations have high political 
participation levels because of their high levels of urbanization, 
literacy, and media development. The citizenry of countries with 
little urbanization, many illiterates, and few mass media are less 
fortunate in this respect. Free political institutions in the Western 
world are in part maintained by their tendency to decrease violent 
domestic behavior which itself finds expression within more radical
voting positions that are tolerated only by open societies.

The Representational Model in
Cross-National Content Analysis


RICHARD L. MERRITT 1

Systematic content analysis as a tool for political research is not
particularly new. In primitive form it flourished during [...]
Students of journalism and others ascertained attention patterns,
as indicated by column inches or occasionally by word counts, for
wide varieties of newspapers and other publications; they compared
patterns of attention to political events in the same publications
over time; they contrasted political interest in metropolitan
dailies to that in small-town weeklies; they [...] time with
questions of appropriate sampling and [...]

It remained for Harold D. Lasswell and his colleagues, however,
to develop content analysis as a tool specifically for comparative
political research. Their studies of attention patterns in
the “prestige papers” of five countries set standards of [...]
and objectivity that students of comparative political behavior
have sought to emulate for well over a [...]

David C. McClelland, Karin Dovring, Robert C. North, Robert
C. Angell, J. Zvi Namenwirth, Richard L. Merritt and Ellen B.
Pirro, and others have undertaken substantial content analyses of
aspects of the communication process relevant for the
cross-national study of politics.

The increasing importance of content analysis as a research
tool, no less than the fact that increasingly large sums are being
spent for studies using the technique, suggests that the time has
come to pause and reexamine some of [...] assumptions. One
important assumption concerns the nature of the “representational
model” used in such studies, that is, the posited relationship
between observed and unobserved aspects of the communication
process. In this paper I shall express my own concern
about developments along this line, paying particular attention to
some of the cross-national content analyses of recent years. A
caveat at the outset: If my comments appear unduly pessimistic, it
is because of the cavalier treatment given to this problem by some
scholars rather than because of the fact that the problem itself is
irresolvable. The problem can be resolved. But, as I shall suggest,
its resolution will require both serious thinking about the
methodology of content analysis and serious experimental work.

1 Research on this project has been supported by the Yale Political Data Program.

Content Analysis and the Communication Process

The communication process, in Lasswell’s phrase (slightly
modified), deals with WHY WHO says WHAT to WHOM and
with WHAT EFFECT, expressed schematically in Figure 1. Content
analysis focuses on the message, or the WHAT in Lasswell’s
formulation. It is the systematic, objective, and quantitative
characterization of content variables manifest or latent in a message. 2
In principle any type of message may be content analyzed:
interesting cross-national work has been performed on movies by
Martha Wolfenstein and Nathan Leites; 3 on plays by Donald V.
McGranahan; 4 and on doodling and designs on vases by Elliot
Aronson. 5 To date, however, most cross-national content analyses
have dealt with written messages; and it is with these that this
chapter will be primarily concerned.


Figure 1

THE COMMUNICATION PROCESS

Content analysis research entails a number of distinct but
interrelated steps. This is not the place to discuss its methodology at
great length; but a brief outline of these steps, and some of the
problems encountered at each, will help to set the stage for some
remarks on the representational model.


2 Bernard Berelson, Content Analysis in Communication Research (Glencoe: Free
Press, 1952), p. [...] This paper will not deal with the nonfrequency type of content
analysis [...] Alexander L. George, “Quantitative and Qualitative Approaches to
Content Analysis,” in Ithiel de Sola Pool (ed.), Trends in Content Analysis (Urbana:
University of Illinois Press, 1959), pp. 7-32.

3 Martha Wolfenstein and Nathan Leites, Movies: A Psychological Study (Glencoe:
Free Press, 1950).

4 Donald V. McGranahan and Ivor Wayne, “German and American Traits Reflected
in Popular Drama,” Human Relations, I (1948), 429-55.

5 Elliot Aronson, “The Need for Achievement as Measured by Graphic Expression,”
in John W. Atkinson (ed.), Motives in Fantasy, Action, and Society: A Method of
Assessment and Study (Princeton: Van Nostrand, 1958), pp. 249-65.

The Formulation of Hypotheses. Ideally, the analyst formulates
his hypotheses (as well as their alternatives) for testing at the 
outset of his project. Content analysis is useful only when the 
researcher has questions of a quantitative nature—how often? how 
much? how many? with what covariance?—that can be answered 
by counting the appearance of a limited number of content varia¬ 
bles in a given body of data. It is not particularly helpful if the 
task of research is merely to determine the timing of a sequence 
of events (such as the death of Stalin and the subsequent emer¬ 
gence of Khrushchev as the Soviet leader); it is of more use in 
trying to determine what effects the events had upon people’s
perceptions, attitudes, and values (such as in messages communi¬ 
cated by the Soviet elite). The task of the analyst is to frame his 
questions so that quantitative data can answer them clearly, di¬ 
rectly, and simply. 

It must be added that there is usually considerable interplay be¬ 
tween the hypothesis-formulation stage and data-gathering stages 
in a content analysis, as in other types of research. It may even 
turn out that the most fruitful hypotheses do not emerge clearly 
until after the analyst has examined his preliminary findings. 
Other important scientific discoveries have resulted from studies 
based on hunches rather than rigidly formulated propositions. 
Sometimes research of this sort is inefficient; but the analyst who 
is sensitive to his findings may produce more interesting and 
meaningful results than the analyst who is blindly testing pre¬ 
formulated hypotheses. 

The Selection of an Appropriate Sample. The determination of 
what body of material could be used to test the hypotheses rests 
upon both the availability of data and the nature of the inferences 
to be drawn from the analysis. To get an idea of values current 
among Soviet elites, for instance, it would be ideal if we had 
access to the minutes of Presidium meetings. But such data are 
not at our disposal. In their absence, will the news columns of 
Pravda or Izvestia give us the information we want? Similarly, if 
our files of the most nearly ideal body of material are incom¬ 
plete, it may be necessary to work out a compromise: accepting 
the information loss due to missing data; estimating the nature of 
the missing data through statistical techniques already developed; 
selecting a second-best source of data; possibly even using avail¬ 
able files of the first choice as a check on trends present in the 
second choice.

The sampling procedure itself is a function of the type and
amount of information needed to test the hypotheses as well as
of what economists call the “opportunity cost” of securing a certain
amount of information. Appropriate sampling techniques include
random sampling, using a table of random digits; systematic
sampling, selecting every nth item in a series, or picking newspaper
issues on the 1st and 15th of every month; and stratified
random sampling for bodies of material that can be broken down
into discrete categories.
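
The three sampling techniques just named can be sketched in a few
lines of Python. This is only an illustration under stated
assumptions: the list of newspaper issues, the two stratum labels,
and the function names are hypothetical and are not taken from any
of the studies discussed in this chapter. A seeded pseudo-random
generator stands in for the table of random digits.

```python
import random


def simple_random_sample(items, k, seed=0):
    """Draw k items at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(items, k)


def systematic_sample(items, n, start=0):
    """Select every nth item in a series, beginning at index `start`."""
    return items[start::n]


def stratified_random_sample(items, stratum_of, k_per_stratum, seed=0):
    """Draw up to k items at random from each discrete category."""
    rng = random.Random(seed)
    strata = {}
    for item in items:
        strata.setdefault(stratum_of(item), []).append(item)
    return {
        name: rng.sample(members, min(k_per_stratum, len(members)))
        for name, members in strata.items()
    }


# Hypothetical universe: 120 numbered issues of two kinds of newspaper.
issues = [("daily" if i % 2 == 0 else "weekly", i) for i in range(120)]

random_part = simple_random_sample(issues, 10)
every_fifteenth = systematic_sample(issues, 15)
by_kind = stratified_random_sample(issues, lambda issue: issue[0], 5)

print(len(random_part), len(every_fifteenth))
print({kind: len(chosen) for kind, chosen in by_kind.items()})
```

Which of the three is appropriate depends, as the text says, on the
information needed and on what it costs to obtain it; the sketch
only shows that the procedures differ in how items are selected.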

Validating the sample, to see whether or not it is actually
representative of the universe of items from which it was drawn, can
be problematical. For random samples, standard statistical
techniques (e.g., the “split-halves” technique within the sample itself or
comparison of the sample with an independent sample from the
same universe of items) are readily available. Often in political
research, however, we cannot be quite certain of the randomness
of the sample. Published foreign office documents, for instance,
are clearly not exhaustive of all documents in a country’s foreign
office. Compilers of such documents necessarily use some criteria
of relevance in deciding which items to include and which to
exclude. The extent to which a random sample of the published
collection actually approximates the distribution of documents in
the entire files is a question that demands an answer if we are
to credit any content analysis of the sample. Statistical techniques
will tell us whether or not the sample is representative of the
published documents, but correction factors are necessary to answer
the more difficult question. Or else, some serious digging must be
done in the particular country’s foreign office files.
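
One simple reading of the “split-halves” idea can be sketched as
follows. The sketch assumes that each sampled item has already been
assigned a single category code; the codes shown and the gap
measure are hypothetical choices made for illustration, not a
procedure prescribed by the studies cited here. It also says
nothing about the harder question raised above, namely whether the
published collection resembles the unpublished files.

```python
import random
from collections import Counter


def split_halves(sample, seed=0):
    """Shuffle a copy of the sample and cut it into two halves."""
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def proportions(codes):
    """Relative frequency of each category among the coded items."""
    counts = Counter(codes)
    total = len(codes)
    return {category: n / total for category, n in counts.items()}


def max_proportion_gap(half_a, half_b):
    """Largest difference in category share between the two halves."""
    pa, pb = proportions(half_a), proportions(half_b)
    return max(abs(pa.get(c, 0.0) - pb.get(c, 0.0)) for c in set(pa) | set(pb))


# Hypothetical category codes for 40 sampled documents.
codes = ["treaty"] * 20 + ["trade"] * 12 + ["protocol"] * 8
first, second = split_halves(codes)
print(max_proportion_gap(first, second))
```

A small gap suggests that the two halves tell the same story; how
small is small enough is a judgment the sketch does not settle.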

The Selection of Units of Analysis. Content analysts have
generally [...] types of units: space, symbols, and
themes. Determining the relative amount of space devoted in a
message to a particular topic is often a good indicator of the
communicator’s concern with the topic. If we view words as symbols
for content analysis purposes, then establishing a list of relevant
symbols is a crucial step. The experience of the RADIR project,
which concentrated on [...] symbols supposed to reflect trends in
world politics with particular reference to changing attitudes
toward the values of democracy, fraternity, security, and
well-being, is instructive in this regard. Pool writes:

Our own procedure in attempting to draw up a relatively valid list was
to draw upon the best knowledge available and to use a long enough
list so that the arbitrary decisions about inclusion or exclusion would
affect the relatively infrequent terms in the tail of the
distribution, rather than more common words. To draw up the list [...]
called upon Harold D. Lasswell, for thirty years one of the leading
students of political movements and propaganda. The list he drew up
consisted of nouns, although the listed words [...]
appeared in other forms. The list was then subjected to the test of [...]
Any expert, by pure oversight, might omit some symbols of obvious
importance. Our readers were, therefore, instructed to note and report any
additional symbols that seemed appropriate to the list.

Some of the more recent cross-national content analyses have
dealt with themes. Angell delineated 40 value dimensions relevant
for Soviet and American ideology (e.g., “Mode of Ownership of
Property”), and coded “elite” publications in the two countries
according to several possible positions along each dimension.
McClelland searched children’s readers in 41 countries for their
concern with a need for achievement, affiliation, and power. And
Stone’s General Inquirer “tags” words (which can also be used as
symbols) according to a predetermined list of concepts. Which
type of content variable is most appropriate for a particular
analysis rests, of course, upon the type of information needed to test the
researcher’s hypotheses.

Establishing Procedures for Counting. Perhaps the simplest type
of content analysis uses a frequency count of the appearance of
the content variables. Merritt, in his examination of the colonial
American press, for instance, tabulated the frequency with which
place-name symbols occurred. 7 Angell counted the frequency with
which Soviet and American publications took positions along his
40 value variables. In the former case, each reference to a unit of
analysis was recorded; in the latter, no content variable could be
coded more than once in any single communication.
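
The difference between the two counting rules is easy to see in a
small sketch. The tokenized “communications” and the symbol list
below are invented for illustration and are not drawn from either
study; the function names are likewise hypothetical.

```python
from collections import Counter


def count_every_reference(communications, variables):
    """First rule: record each occurrence of a content variable."""
    counts = Counter()
    for tokens in communications:
        counts.update(token for token in tokens if token in variables)
    return counts


def count_once_per_communication(communications, variables):
    """Second rule: a variable is coded at most once per communication."""
    counts = Counter()
    for tokens in communications:
        counts.update(set(tokens) & variables)
    return counts


# Two hypothetical communications, already split into tokens.
docs = [
    ["boston", "boston", "london"],
    ["london", "virginia", "london"],
]
symbols = {"boston", "london"}

print(count_every_reference(docs, symbols))
print(count_once_per_communication(docs, symbols))
```

Under the first rule a symbol repeated within one item weighs more
heavily; under the second, the count measures the number of
communications in which the symbol appears at all. The two can
rank the same symbols differently, which is one reason the rule
has to be fixed before counting begins.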

7 Richard L. Merritt, Symbols of American Community, 1735-1775 (New Haven:
Yale University Press, 1966).

It is also possible to add vectors to frequency counts. The
RADIR studies noted whether the context of the tabulated symbols
was positive, neutral, or negative. The Stanford project on
conflict and integration, directed by North, codes communications
along several dimensions, such as “good-bad,” “active-passive,”
“strong-weak,” and “hostility-friendship.” 8 The Yale Arms Control
Project coded French, German, British, and American editorial
responses to arms control proposals along 5-point or 7-point scales
according to the perceived specificity or diffuseness of the
proposal, its operationality or nonoperationality, the level of affect
displayed, and so forth. 9

In either event, objectivity requires that these technical aspects
of the content analysis be specified in advance, and that
throughout the analysis there be strict adherence to the coding procedure.
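
Adding a direction to a frequency count, as in the three-way
positive, neutral, or negative coding described above, can be
sketched as a small cross-tabulation. The coded references and the
“net direction” index are assumptions introduced here for
illustration; the index is not claimed to be the measure used by
any of the projects mentioned.

```python
from collections import Counter

DIRECTIONS = ("positive", "neutral", "negative")


def tally_by_direction(coded_references):
    """Cross-tabulate (symbol, direction) pairs fixed in advance."""
    table = {}
    for symbol, direction in coded_references:
        if direction not in DIRECTIONS:
            raise ValueError(f"unknown direction: {direction!r}")
        table.setdefault(symbol, Counter())[direction] += 1
    return table


def net_direction(counts):
    """Hypothetical index: (positive - negative) / all references."""
    total = sum(counts.values())
    return (counts["positive"] - counts["negative"]) / total


# Hypothetical coded references to one symbol.
coded = [
    ("democracy", "positive"),
    ("democracy", "negative"),
    ("democracy", "positive"),
]
table = tally_by_direction(coded)
print(dict(table["democracy"]))
print(net_direction(table["democracy"]))
```

Rejecting any direction outside the predetermined list mirrors the
requirement that the coding scheme be specified in advance and
then followed strictly.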

A special problem that arises with computerized content analysis
is the transformation of existing material into texts that can be
handled by current programming techniques. In the case of the
General Inquirer, this requires a certain amount of editing:
breaking complex sentences down into simple thought-sequence
units; adding information not normally found in the computer’s
memory drum (e.g., adding parenthetically “warm vacation place”
to references to Florida); clarifying the referent of ambiguous
words (references to the singer George London and the city of
London); and “tagging” some words or combinations of words
relevant to concepts in which the analyst is interested (e.g.,
“affect,” “European economic integration”). The Stanford project
utilizes “evaluative assertion analysis,” which translates messages
“into a simple, three-element assertive format.” 11 Such
transformations, although seemingly simple, may contribute significantly to
the level of error in any content analysis.

Philip J. Stone, Robert F. Bales, J. Zvi Namenwirth, and Daniel M. Ogilvie, “The
General Inquirer [...]

[...] Holsti [...] into [...]:

1. Americans are treacherous.

2. Americans are aggressors.

3. Americans are abetting Japanese ruling circles.

4. Japanese ruling circles are corrupt.

If we follow the normal canons of logic, the four assertions are most assuredly not
a reasonable restatement of the original sentence.

Training Coders and Testing Coder Reliability. There is general
agreement that coders need to be sufficiently trained and
have enough understanding of the coding categories that two
coders working independently will produce quite similar results.
This implies a necessity for working out coding manuals and
other training procedures. When coding is being performed for
the analyst who originated the coding procedures, there is a
marked tendency to postpone any effort to formalize the
techniques, that is, to write them down in detail. If students and
scholars at other universities are to be able to use the procedures,
however, explicit and precise coding manuals are imperative.
With time, as computerized content analysis becomes more fully
developed, it will be possible simply to exchange data preparation
routines and computer programs that can be used anywhere
by a novice in “cookbook” fashion.

Testing intercoder reliability is a relatively underdeveloped
facet of content analysis. This is not to say that techniques to
measure reliability have not been developed. Or even that
Berelson’s complaint of a decade and a half ago is still valid:
“Whatever the actual state of reliability in content analysis, the published
record is less than satisfactory. Only about 15-20% of the studies
report the reliability of the analysis contained in them.” 12 In fact,
the most important cross-national analyses of recent years have
been quite careful to discuss their problems of reliability.

12 Berelson, Content Analysis in Communication Research, p. 172.

13 For an excellent discussion of reliability indices, see William A. Scott, “Reliability
of Content Analysis: The Case of Nominal Scale Coding,” Public Opinion Quarterly,
XIX (1955), 321-25.

Two key aspects of intercoder reliability checks have nonetheless
received insufficient attention. The first is the question of
acceptable levels of reliability. What does it mean when a
content analyst reports that his reliability score for two coders, using
a simple percentage agreement test, 13 is .70? How much more
useful or valid is the analysis if the percentage agreement is .80
or .90? How does a reliability coefficient for the percentage
agreement test compare with Scott’s reliability index or with a
Pearsonian product-moment correlation coefficient? Second, very little
experimental information exists on the determinants of coder
reliability. What role does the explicitness of the instructions in
the coding manual play? What type of training and practice [...]
most likely to enhance coder reliability? Does the
difficulty of securing high intercoder reliability coefficients increase
if themes rather than symbols are coded? What impact does the
educational and intelligence level of the coder have upon his
performance? [...] attention to these basic issues
would be fruitful for the future development of content analysis. 14
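
The comparison between a percentage agreement score and Scott’s
index can be made concrete. The sketch below assumes two coders
who each assign one nominal category to the same ten items; the
codings are invented so that raw agreement comes out at exactly
.70. Scott’s index is computed in its usual form, (Po - Pe) /
(1 - Pe), where Po is the observed agreement and Pe is the
agreement expected by chance from the pooled category proportions
of both coders.

```python
from collections import Counter


def percentage_agreement(coder_a, coder_b):
    """Share of items on which the two coders assign the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must code the same non-empty set of items")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return agreements / len(coder_a)


def scotts_pi(coder_a, coder_b):
    """Scott's index: observed agreement corrected for chance."""
    observed = percentage_agreement(coder_a, coder_b)
    pooled = Counter(coder_a) + Counter(coder_b)
    total = 2 * len(coder_a)
    expected = sum((n / total) ** 2 for n in pooled.values())
    if expected == 1.0:
        raise ValueError("index undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten items into two categories.
coder_a = ["pos"] * 6 + ["neg"] * 4
coder_b = ["pos"] * 4 + ["neg"] * 5 + ["pos"]

print(percentage_agreement(coder_a, coder_b))
print(round(scotts_pi(coder_a, coder_b), 2))
```

With these invented codings the percentage agreement of .70
corresponds to a chance-corrected index of roughly .39, because
with only two categories a good deal of agreement is to be
expected by chance alone. The same raw percentage would yield a
different index with more categories or a different distribution
of codes, which is why the two kinds of score cannot be compared
directly.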

Inferences from Content Analysis 

[...] content analysis focuses on the message, or the WHAT
in Lasswell’s formulation. Our reasons for wanting
to know the substance or form of the message may be various. On
one level the message itself may be of interest to us: we may be
curious to know, for instance, what the [...] a particular arms
control proposal, or the frequency of certain [...] of usage in the
editorials of [...] newspapers. More frequently we are [...]
because we think it contains clues about [...] observable, [...]
of the communication process.

Sometimes the content analyst is interested in the recipients of
a set of messages, the WHOM of the earlier formula. Part of the
[...] using “prestige papers” to estimate the mood of
the elite [...] Pool that these newspapers are “read by [...]” 16
The question of readership posed by Pool’s assertion
may be looked at in two ways. On the one hand, we would like
to know who the actual, as opposed to the intended, recipients
of the message are. Who in fact reads the New York Times? Of
[...] who reads the editorials? What percentage of the reader’s
[...] is devoted
to perusal of the Times? On the other hand, what do the intended
recipients of the messages in fact read? What percentage of them
reads the message? Which of them also read other (and possibly
contradictory) messages as well? As will be suggested later, the
answers to such questions lie not in a content analysis itself, nor
even in the force of logic. Questions of actual as opposed to
intended readership lie more properly with various types of media
analysis through survey research.

14 [...] interviewer bias [...] training coders, see [...] “How to Learn the Method of
Content Analysis for n Achievement, n Affiliation [...]”; [...] “[...] of the Objectivity
of the Method of Content [...],” [...] pp. 234-41.

15 A number of important points cannot be discussed in this paper. One is the
problem of error stemming from the [...] of the communication
process. A second [...] on latent and manifest
levels of communication. Third, there is the issue of whether the message performs an
instrumental or representational function for the communicator. On this last point,
see Pool (ed.), Trends in Content Analysis, pp. 206-12.

16 Pool et al., The “Prestige Papers,” p. 7.

The issue of WHAT EFFECT the message has upon its recipient
is still thornier. Pool’s assertion that the “prestige” newspapers
are not only read by elites but also “influence” them raises
unanswered questions about individual and group decision-making
processes. 17 To be sure, it is important to know what is made
available to a decision-making system. But it is even more
important to know what is assimilated or accepted for use by the
system. For instance, suppose that a person reads a message
telling him to vote for a particular candidate in an election, and
then goes out to vote for him. Can we infer a causal relationship
between the message and the ballot? It may be that the person
happened to pick up the message as he was already on his way to
vote for the candidate. Or it may be that persons likely to vote for
the candidate are more likely than others to happen upon such
literature. Or it may be that the message did indeed persuade the
voter to opt for the candidate. At this stage in political research,
determining the effect of communication upon attitude change is
simply not a function of content analysis itself (unless the analyst
has independent validating evidence, such as that produced by
experimental psychology, in which case the content analysis may
be superfluous).

Sometimes we are interested in determining WHO the
communicator is. This is the case in propaganda analysis where we
assume that, if we know the source of a message, we shall also
know the extent to which it is likely to contain biased information.
Lasswell and his colleagues used content analysis techniques
during World War II with great effectiveness to determine the
extent to which certain American publications contained news and
editorial comment stemming from Nazi sources. 18 Discovering who
the author of a message is has also been important in some types
of literary detective work. Recent efforts by Mosteller and
Wallace, using electronic computers, to infer who wrote which
of the Federalist papers are exemplary in this regard. 19

18 Harold D. Lasswell, “Detection: [...],” in
Harold D. Lasswell, Nathan Leites, and Associates, The Language of
Politics: Studies in Quantitative Semantics (New York: Stewart, 1949), pp. [...]

In cross-national political research it is usually clear who the
communicator is. Sleuthing is generally directed to other ends. The
question of who the communicator is nonetheless raises in
elementary form the basic issue of the representational model used
in content analysis research. That is, are we interested in the
communicator himself, because of his personal attributes? Or do we
examine his messages because he seems to be speaking for some
other group, such as the organization or culture of which he is
a member? Another way of looking at these questions is to ask
what motivates the communicator: WHY does he transmit a
particular message?


The Representational Model: 

Why the Communicator Communicates

Individual motivation rests upon a variety of subtly operating
factors in the human psyche. Not the least of these is the nature
of the information that an individual has at his disposal when he
makes decisions. The amount of information available to the
individual is limited by both chance and choice. He does not see,
for instance, most newspapers published in the United States, nor
is it likely that he could manage to read them were they all
delivered on his doorstep. Every individual consciously and
unconsciously screens out certain types of information: he may
deliberately choose to skip some sections of his morning newspaper,
such as the women’s page or the financial section; if he reads the
paper when he is tired he may miss some of the more subtle points
expressed by editorial writers; moreover, experimental evidence
indicates that some people literally do not see certain items that
disagree with their preconceptions. In contrast to the input of
current information (values, attitudes, beliefs) there is also
information stored in memory. In the individual’s active memory is
much information that can be readily recalled, information ranging
from the date of his birth to his perception of the course of events
in Vietnam. More deeply stored information includes items of
very low salience, such as the telephone number of his childhood
residence, as well as such repressed data as painful emotional
experiences in childhood. Individual motivation also rests upon a
person’s perception of alternative courses of action as well as their
likely outcomes. Some behavior is purposeful: a person postulates
a set of goals and then implements them as best he can. At the
same time it must be added that random or habitual behavior
often plays a role in the communication process, in determining
what things a person will communicate and how he will
communicate them.

19 Frederick Mosteller and David L. Wallace, Inference and Disputed Authorship:
The Federalist (Reading, Mass., Palo Alto, Calif., and London: Addison-Wesley, 1964).

In short, individual motivation is at best a complex mix of both 
current and stored information, perceptions of modes of behavior, 
and some nonrational factors such as chance and habit. If the 
task of content analysis is to infer a person’s motivations from his
messages, then what is needed is a sound theory bridging the 
gaps among motivation, verbal behavior, and other forms of 
behavior. Freudian psychology presents one possible bridge: the 
goal of the psychoanalyst (who was instrumental, by the way, 
in the development of content analysis techniques) is to try to 
account for individual behavior through the examination of a wide 
range of the individual’s messages. Some scholars have even tried 
to “psychoanalyze” historical personages by content analyzing their 
verbal messages and comparing these messages with those pro¬ 
duced by currently living personality types, whose characteristics 
have been analyzed clinically. 20 

The problem of motivation becomes still more complex as soon
as we move from the personal to the public realm. Political
psychology aside, content analysis generally deals not with the private
utterances of a man lying on a psychoanalyst’s couch but with his
public messages: the speeches he delivers, the pictures he paints,
the memoranda and position papers he drafts, the editorials he
writes, and so forth. If we are looking for the reason, or
motivation, for such communications, then we may examine either the
man’s personality structure, or his relationship with the
environment, or both. The question to be asked is: Whom or what does
the individual represent when he communicates?

One possible answer is that he represents himself and no one 
else. He is seeking to express his own mind rather than pretending 
to be the spokesman for any group or culture. Such an answer, 
however, poses new questions: (1) How accurately does the 

20 For an excellent example, see Alexander L. and Juliette L. George, Woodrow 
Wilson and Colonel House: A Personality Study (New York: Day, 1956). 
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PROBLEMS 

rSar^of “? e ” ^ ** he <*o«e the 

p racuiar mode of communication that he did? (3) Whv did 

be PUbIfahed 0r the « ^ de d 

values attitudes andTh* 5 ? 6 "* We they “ a § reement with the 
’ ttitudes, and beliefs expressed in the message? Tf 

level of agreement were high, then we might argue that the 
mumcator regardless nf Lie • + tnat the com- 

perceived tn W? his intentions and preferences, may be 

of the meins ofT eSentUlg S °T’™ ^ (e 'S" those to —I 
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Sr ° h f 1118 mes T ? To what 

reZlTi 7 u : nces? Did r seek to tCzr 

gardless of intention or preference the extent- tn v 5 

the to which 

discover the degree of congruence^ P ' ’ h ° WeVer ’ “ t0 
mrl7“ P ' T huS We ma y be interested 
views than as an 

ehte” think. Among the significant Jr" ZlZ 

this position are the following- m L \ , e it we take 

—tor's message 

eithlr S ° Ugbt hy 

fnWd P L ceT® 68 ^ ** 3 re P res “nlf model m^be 8 

I — 1is the “ 

how can we tell whether the communicator was consciously trying
to mirror group attitudes or whether he was trying to persuade the
group to adopt new attitudes? In the latter case, the content
of the message might deviate sharply from [...]

(4) Since most people are members of more than
one group, what is the mix of different group influences
relevant for any single individual’s message? When [...]
editorial in the Journal of the American Medical Association, how
sure can we be that his views represent those of the
medical profession, which comprises by and large Protestants of
Anglo-Saxon origin? (5) To what extent is any message that a
person communicates influenced by the overall culture of which
he is a member? That is, how much by way of group or cultural
values creeps autonomously into every message?

If the purpose of content analysis, then, is to extrapolate from
observable variables in messages to nonobserved motivational variables,
two interrelated questions are crucial. First, is the communicator
perceived or assumed to be representing his own views,
those of the group or groups to which he belongs, or those of his
culture? Second, what mix of conscious and unconscious
elements goes into the formulation of his message? Let us
turn to some of the more recent cross-national content analyses
to see how such questions have been treated.

Pool et al.: The “Prestige Papers.” Perhaps the most elaborate of
these studies is the Hoover Institute’s research on “Revolution
and the Development of International Relations” (RADIR).
The published portions of the project analyze symbols
of democracy and internationalism in newspapers from five countries,
covering the years from 1890 to 1949:

Great Britain    The Times (1890-1949)
Russia           Novoe Vremia (1892-1917); Izvestiia (1918-1949)
United States    The New York Times (1900-1949)
France           Le Temps (1900-1942); Le Monde (1945-1949)
Germany          Norddeutsche Allgemeine Zeitung; Frankfurter Zeitung
                 (1920-1932); Völkischer Beobachter (1933-1945)

As justification for the decision to examine editorials in these
newspapers, Pool writes:

In each major power one newspaper stands out as an organ of elite 
confidence one paper in any given country which plays the role of prestige
paper at any given time.

It fe h rearby e the a d r t , iS '7T IeSPeC ? “ g00d todeX °* eIite behavi “- 
v L u f ! and lnfluences them. In addition, it is produced 

part of *• dite “ d £•» 

The argument is plausible, but is it true? We know that the “pres- 

whattr of T 7;“ ive of something or someone. But of 

editorials on'te” 7 a 3 "”” 1 ^“P 16 ’ New York Times 
struede nrior fnT" ° f Amen f an Participation in the Vietnam 

calkdtdbt f 6 SUmmer ° 1965 at least ’ could steely be 
. of allt A g0v . ernment P° hc y or even of informed opin- 

overthelomTn 6 " 03 ^ 6146 gr0Up .! n ® !- U is doubtless true that- 
over the long run, and given a wide range of issues, the New York
Times is closer to “official” or “elite” opinion than any other single
publication in the United States. Despite the fairness of this
assumption, it cannot be a fully satisfactory answer to the question
raised above until empirical tests can show an actual

attitudes ° v b “ «*> distributee 

—p— - 

A second set of questions was raised earlier: Does the elite in
fact read the prestige papers, in the United States or elsewhere?
And how can we verify whether or not the editorials

influence those who read them? In these regards inteLive tier 

■“ pl * •*» 4. 

A further question is the extent to which the prestige papers compare
in their expressed values, attitudes, and beliefs with the

separately with other^ stuXeTusfna^a’ - PP 't 7 - 111 tllis analysis I shall not deal 
Schramm (ed.), One Day in the World’s F^T^r ^ 1 A r modeL CL Wilbur 

°f Crisis > wUh Translations and Facsimile Rel'JdTf ™ % reat r Ne ^ s P a P^s on a Day 
sity Press, 1959); and J. Zvi NamenwTrth fnd rl ' T R (Stanford: Stanford Univer- 

? ““wt, G rha r v'ef/lT7 10 COn>M A ’ nalySiS . 

Research Center of TTrUniverskT of Mfct'* S,m,lar f° t . hose Performed at the Survey 
men and their constituents; cf. wlrren E. m£ “i *fTT , hetw ,^ n congress- 
other newspapers in their own countries. Two projects currently
under way at Yale University are seeking clues to resolve this
problem. One, under the direction of J. Zvi Namenwirth, is content
analyzing three “elite” and three “mass” newspapers in the
United States, using the General Inquirer procedure. The other
is investigating editorial attitudes in a wide variety of French,
West German, British, and American journals toward specific arms
control events and proposals; a comparison of the prestige papers
with the others will at least give us an idea of how typical they
are of the press of the different countries.

Finally, the study of “elite” newspapers poses a problem similar
to that faced by students of community power structures who
concentrate upon “community influentials.” A newspaper may
enjoy a reputation for influence when in fact it is not influential.
Other newspapers, although perhaps somewhat less intellectual
than the prestige papers, may be widely read by elite groupings.
Or it is possible that a newspaper loses whatever influence among
the elite it once had. If we continue to concentrate upon attention
and value patterns in newspapers after they have passed their
zenith, we may be deluding ourselves about actual trends in the
country. But, then, how do we know when the star of an elite
journal is falling and that of another publication is taking its place?

Karin Dovring: Land Reform as a Propaganda Theme. The
focus of this study is the ideological coloration of demands for
land reform. Ten documents covering the period from 1891 to
1952 are analyzed:


Vatican         Papal Encyclical, “De rerum novarum” (1891)
Soviet Union    Lenin, seven pamphlets and speeches (1913-1919)
Vatican         Papal Encyclical, “Quadragesimo Anno” (1931)
Vatican         Pope Pius XII, Pentecost message (1941)
France          Tanguy-Prigent (Socialist and later Minister of
                Agriculture), “Démocratie à la terre” (1945)
Hungary         András Sándor, “Land Reform in Hungary” (1947)
Bulgaria        Vulko Chervenkov (Secretary of the Central
                Committee of the Bulgarian Communist Party,
                later Prime Minister), speech, “Tasks of the
                Co-operative Farms” (1950)
Italy           Government statement on the need for land reform,
                “La Relazione Ministeriale” (1951)
Italy           Giuseppe Medici (Head of Ente Maremma, later
                Minister of Agriculture), “Il Contratto con i
                Contadini” (1952)
East Germany    West German Bundesministerium für gesamtdeutsche
                Fragen, pamphlet, “Auf dem Wege zur Kolchose” (1952)

These documents are searched systematically for symbols of
identification, of demand for certain values, and of resistance to
other values. Dr. Dovring tabulates the frequency of the symbols
grouped into themes; determines their function (that is, whether
they are symbols of identification, demand, or resistance); and
notes whether their contexts are favorable or unfavorable.
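
The tabulation procedure just described can be illustrated with a minimal sketch. The document labels, theme names, and coded observations below are hypothetical and are not drawn from Dr. Dovring's data; the sketch assumes only that each symbol occurrence has already been hand-coded by theme, function (identification, demand, or resistance), and context (favorable or unfavorable).

```python
from collections import Counter

# Hypothetical hand-coded observations, one tuple per symbol occurrence:
# (document, theme, function, context). None of these values come from
# the study itself; they only illustrate the shape of the tabulation.
coded_symbols = [
    ("doc_a", "property", "demand", "favorable"),
    ("doc_a", "property", "demand", "favorable"),
    ("doc_a", "collective", "resistance", "unfavorable"),
    ("doc_b", "collective", "identification", "favorable"),
    ("doc_b", "property", "resistance", "unfavorable"),
]


def tabulate(observations):
    """Count occurrences of each (document, theme, function, context) cell."""
    return Counter(observations)


def theme_totals(observations, document):
    """Frequency of each theme in one document, ignoring function and context."""
    return Counter(theme for doc, theme, _, _ in observations if doc == document)


table = tabulate(coded_symbols)
for cell, count in sorted(table.items()):
    print(cell, count)
```

A table of this kind records only what was coded; it says nothing about whether the documents chosen are representative, which is the question taken up next.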

As far as her representative model is concerned, Dr. Dovring
writes that the ten documents have two things in common. First,
they are regarded as responsible statements justifying agrarian 

World Wflr^T^ “ r ?, S P ective countries after the Second 
World War And, second, they claim to deal with agrarian or 

social questions, but at the same time they are all living state¬ 
ments of current ideologies in conflict today.” 23 Hence the mes 

SJeT Sent ” ° fflCial ° Pinl0n “ ^ " 0ffidal and ’ 

This representational model poses a number of questions.
First of all, the principle underlying the selection of particular
documents is not stated anywhere. This is particularly noticeable
with respect to Communist proposals. Of all postwar statements

” and Eastern Europe (such as those cited in other 

parts of the book of which this study forms one chapter), for example,
why were those by Sándor and Chervenkov rather than
others included in this survey? Granted that they are authoritative

CzJhoL v rePreS « ntattVe o° f knd ref0m measures 
Czechoslovakia, or Rumania? Similarly, is it true that the best
statement of East German land reform measures and proposals is
a pamphlet published by a West German government agency?
Why not Wilhelm Pieck’s speech in 1945 entitled “Junkerland in

cTT*?’ “ F ° ,te 

(The Mihoe,' utf™'^r.nm 
Bauernhand!” or Walter Ulbricht’s lengthy chapter on “The Democratic
Land Reform”? 24

Second, is there any measure of functional equivalence among 
the different messages? That is, do they serve the same function 
in all the societies included in the survey? Is it realistic to compare 
a series of statements by Lenin during prerevolutionary and 
revolutionary times with an Italian government statement on the 
need for land reform? Since three of the ten documents stem from 
the Vatican, it seems legitimate to question what role the Vatican 
plays in European land reform. How much influence does it exert 
over individual (noncommunist) governments? Or are the three 
messages included to give an estimate of a changing mood in
European intellectual circles? 

Third, it seems necessary to look closely at the function of a
particular message in its society. Put another way, why did the
communicator transmit the message? Was it merely to announce
a new policy generally acceptable to the population at large? Or
was it to persuade intransigent opponents of the need for such a
policy? Or was it an instrumental message designed to achieve
other ends (e.g., promising long-run support to land-hungry peasants
in exchange for their support of other controversial measures)?
Perhaps the clearest question arises here with respect to
Lenin’s statements, four of which were pamphlets written in 1913
and directed to intellectuals, and the other three of which were
statements made to elements of the peasantry in the desperate
years of struggle, 1918-1919. It may turn out that several different
messages communicated by a single individual are more similar
(regardless of intent) than messages emanating from different
communicators; for studies of this sort we need some indication of
the magnitude of these differences.

McClelland: The Achieving Society. McClelland analyzed the 
content of children’s stories from 23 countries for the period 1920- 
1929 (centering around 1925) and from 41 countries for the period 
1946-1955 (centering around 1950), using the analytical frame- 
work developed for projective tests to measure the need for 
achievement (n Achievement), for affiliation (n Affiliation), and


24 Wilhelm Pieck, “Junkerland in Bauernhand!” speech on September 2, 1945,
in September 6, 1945; Walter Ulbricht, “Die demokratische
Bodenreform,” in Zur Geschichte der neuesten Zeit (Berlin: Dietz, 1955), Vol. I,
Part 1, pp. 208-38. The use of a West German source, by the way, puts Dr. Dovring in
the awkward position of having to invert all her findings on East Germany.
for power (n Power). Data on n Achievement levels in these countries
were then correlated with two indices of modern economic
growth.
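
As a rough illustration of what correlating reader scores with a growth index involves, the sketch below computes a product-moment (Pearson) coefficient. The choice of coefficient is an assumption made here for illustration, since the text does not say which statistic was used, and the two short series are invented numbers, not McClelland's data.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Invented country-level values: reader n Achievement scores and a growth index.
n_achievement = [1.0, 2.0, 3.0, 4.0]
growth_index = [2.0, 4.0, 6.0, 8.0]

r = pearson_r(n_achievement, growth_index)
print(round(r, 3))
```

A coefficient of this kind is only as meaningful as the sample of countries and readers behind it, which is why the sampling questions raised below matter.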

stTiinr zzftszg 
matelv 1 to 3 \ / 3 ) tl • gr up of countnes is approxi¬ 
mately i to 3.) (3) There is a bias in favor of economical^ fl d 
capita from about 1950 to 1960 was 3.41 LCl J , J e , r 
countries for which we have data, as 

34 excluded countries for wTiioL 4 . r ., , , f cent ror 

kss :sv” s zz z 

his main hypothesis. S ’ “ 38 the P re * ctive value of 

scrapping all of^cCleS^ 
H»roi?S! Jr, K»rf W. Deutsch, » d 

Yale University Press, 1964), pp. 149 - 61 . * * Socw/ Indicators (New Haven: 
unsuccessfully in the Library of Congress for such readers, McClelland
wrote to the ministries of education of the various countries,
asking for three “widely-used” readers dating from 1925 and
1950. Where responses were not forthcoming, he relied upon book
dealers in the countries and upon private sources. Such a sampling
process may have been dictated by necessity, but we must be
absolutely clear about the fact that it could not produce (except
by the sheerest of accidents) a random sampling of readers used
in the countries during those time periods. We have no way of
knowing either how representative the selected readers are or how
“widely” they were used. And information of this sort is vital
to an evaluation of the project’s findings. 26

How appropriate are children’s readers for an assessment of
societal values anyway? As will be seen, not even McClelland is
sure of the answer to this question. In fact, in the paragraphs that
follow I shall use his arguments and counter-arguments extensively.
The difference lies in the conclusions we reach.

McClelland first of all rejects the simple notion that the stories
in the readers “represent” solely characteristics of their authors’
personalities. While recognizing that this may be true in part, he
sees the author not as a creator but as a mediator. He
“transmits aspects of the culture to a particular audience: children
and the adults having to do with the education of children who
will decide whether their stories will be included in the textbook
or not.” 27 Such a position raises two problems. First, there are
several points in the process of transmitting values at which errors
of one sort or another can creep in. The author, for instance, has
a wide stock of folklore available to him when he sits down to
write. On what basis does he make his selection of stories to
retell? Can we safely conclude that his vision of cultural values
is reasonably accurate? His manner of writing may emphasize
some values in a story and relegate others to a minor role.

-It is significant in dm regard .ha. McCIelland “L^tc ncyjo 

T7 i0n V hen r fo““m”“ ! « be'fiLdTtto ov.racbiever despite lo, . 

STnlt «« IT*, readers fie 

suggests that they may not have een^r P ^ finds that Turkey has rapid 

stories rather than schoolbooks h f i i b ut l ow n Achievement among 

economic growth that die schoolbooks may be atypically 

The Achieving Society (Princeton:

Van Nostrand, 1961), pp. 101, 266.

27 Ibid., p. 75. 
[...] by those who must decide to adopt
or not to adopt for classroom use a particular set of stories? As
Los Angeles teachers learned in the decade after World War II,
when they decided to adopt UNESCO readers for their [...],
the question of the criteria of selection can become a very touchy issue

cons1derIaon C Th7 mty " ^ thiS ^ 1 

consideration: the process by which textbooks are prepared for
school use is more complicated than McClelland suggests. There

mftteroTi r,i; at h / ]p *° detennine ^S a «r 

Tubhshl td 1 ” ay Ch0 ° Se 3m0ng: between ! 

. nd P artlcldar writers; copyrights and copyright in- ‘ 

diffSationTe gTo” T/ “Orations; regional ! 

differentiation (e.g., how likely is it that a textbook manufacturer
will be able to sell to a school board in Mississippi a reader which
pictures white and Negro children playing together?); personal
relationships between salesmen and school board members in particular
areas; and so forth. Even the reader most representative of

route C3n fdI in *° a “mcwhere along this j 

McClelland then raises “the theoretical issue as to whether
fantasy reflects what a person has or doesn’t have.” 28 Although
there are good reasons why either alternative should be true, I
am not aware of any research that settles the question finally.
Research on n Affiliation performed by McClelland’s associates is
instructive in this regard. They asked two groups of college
freshmen (the first comprising men who had just been accepted
into social fraternities, and the second consisting of men who had
been rejected despite their desire to join a fraternity, and who had
afterward expressed their disappointment to the dean) to write
short stories describing what was probably happening in a picture
flashed briefly before them on a screen. It turned out that the level of
n Affiliation in stories by rejected subjects was almost twice as
high as that in the stories of their more socially accepted colleagues.
These data cannot answer the question conclusively, but
they do suggest that social rejection is associated with n Affiliation
(unless there is an intervening variable that accounts for both).
Whether this relationship between non-affiliation (social rejection)
and n Affiliation holds true for a person’s sense of having achieved
something and his n Achievement score is something that remains

28 Ibid., p. 76.

to be seen. As McClelland points out, “it is impossible to decide
on theoretical grounds which of these two alternatives is most
likely.” 29

Another major problem pertains to the values themselves as they
are portrayed in the children’s readers. Even if we assume that the
author has played the role of cultural bridge, we must still
ask along with McClelland, “of what or of whom the values are
typical.” Do they represent values typical of the culture as a whole
or of specific subcultures (e.g., the intellectual elite)? Do they
represent the values actually held by most of the people in the
culture or just the “best” values that they want transmitted to
children? Or do they even comprise a set of values that a ministry
of education is trying to inculcate in a population? Given the
sampling process used, there is no simple answer to such questions.
McClelland has pointed to several problematical examples:
the fact that Algerian and Tunisian readers, although dealing with
North African themes, were printed in Paris; the fact that Soviet
readers of the 1920’s dealt with values clearly not held by the
masses of peasants; the fact that Argentine readers in the post-World
War II years had “a very strong political slant in that most
of the stories glorified the then dictator Juan Peron.” It is
not difficult to think of others. American civics texts, for example,
emphasize the value of individual political participation but, as
may be seen by the level of participation (other than voting) in
any national or state or local election, this value is not shared in
practice by most Americans.

McClelland’s efforts to date to discover of what or of whom the
readers are “typical” in a national culture have not been very
satisfactory. A national sample survey of Catholic and Protestant
students in the United States offered some confirmation of the
thesis that values in readers are typical of more generally held
cultural values. Less representative but cross-national surveys,
however, have held out less hope. In countries where readers were
low on the n Achievement scale, students scored high in projective
tests for levels of n Achievement, and vice versa. McClelland
rejects the conclusion that such findings cast doubt upon the
validity of reader n Achievement scores as indicators of cultural
values. Instead, he suggests that these findings show that


29 Ibid. 

30 Ibid., p. 101. 




reader scores may not reflect n Achievement levels in any group of individuals
in the country: in this sense any comparison with individual
scores is invalid or unrepresentative. Rather, the reader stress on achievement
may represent something more like “national aspirations”: the
tendency of people in public, in children’s textbooks, to think about
achievement.

In short, after data or their absence have failed to confirm other
interpretations of his representational model, McClelland falls
back on the conclusion that the readers must represent the
totality of the culture which produced them.

McClelland is obviously not happy with this conclusion. His
final statement on his representational model is illuminating:

Comparison of reader n Achievement levels with levels obtained from 
1 «e U meiinf R I-"*®”* “ ‘° just what the Ld- 

measurinTan^?' f “ eVm * hT ° Wn Some doubt “ wbeth « are 

measuring anything of importance, but in the end, the proof of the
pudding is in the eating: do they enable us to predict which countries
will develop more rapidly economically? 32

That the readers do enable such predictions (at least to McClelland’s
satisfaction if not always to that of others) does not get
around the fact that his argument begs the key question of the
representational model.

Angell: Social Values of Soviet and American Elites. The study
focuses on social values held by segments of the Soviet and
American elites, with particular attention paid to values important
in the foreign policy making process. Values are defined as perceptual
images, that is, elements “of the good life as seen by the
person who cherishes” them. 33 Of the six elite groups identified as
being most relevant for the two societies, four are fairly comparable:
military, scientific, cultural, and labor elites. The remaining
two elite groups in the Soviet Union are the government-Party
elite and the economic elite, in the United States the cosmopolitan
elite and the provincial elite. Angell selected the following publications
as representative of the various elites:

rAS ie «i L ,r, s T„’ tt>t “ tte r* **■* 

of the mood or motivational level of a . readers are more of a reflection 

whi 32 h ri s , affccting the next generation.” Ibid., p. to] “ an educationaI influence 

Ibid., p. 79. * i 

of Elite°Media/’ folt’al 1 C ° ntent 
United States
  Cosmopolitan        New York Times, Fortune
  Provincial          Nation’s Business, American Bar Association
                      Journal
  Labor               American Federationist
  Military            Army, Navy, Air Force
  Scientific          Science, American Scientist, GeoTimes,
                      American Institute of Biological Sciences
                      Bulletin, Chemical & Engineering News,
                      Physics Today, Bulletin of the Atomic Scientists
  Cultural            Saturday Review, Harper’s

Soviet Union
  Government-Party    Pravda, Kommunist, Voprosy Filosofii
  Economic            Voprosy Ekonomiki, Sovetskaia Torgovlia,
                      Planovoe Khoziaistvo
  Labor               Sotsialisticheskii Trud
  Military            Krasnaia Zvezda
  Scientific          Vestnik Akademii Nauk, Vestnik Vysshei Shkoly
  Cultural            Novyi Mir, Literaturnaia Gazeta, Teatr


The period covered is from May 1, 1957 to April 30, 1960, a three-year
period of relative peace and quiet in Soviet-American relations.
The frequency of positions taken by or attributed to the
various elite groups in these publications was tabulated, inter-country
variations in positions were systematically examined, and
some attention was given to intra-country variations in value
positions.

Three types of problems arose in the selection of the sample.
First, there was the question of the size of the sample for each
publication. Angell decided to allot “roughly equal reading time
... to the periodicals for each of the six elites,” but, if anything,
“more effort should be put into analyzing Soviet journals since we
felt we already knew much more about American than about
Soviet society.” 34 He does not give information about the reasons
for the particular sampling design chosen (e.g., every 22nd daily
issue of Pravda but every 43rd daily issue of the military journal
Krasnaia Zvezda). A second problem was the paucity of value


34 Ibid., p. 336. 
statements in some sources. This was particularly the case with
American scientific journals:

It was not intended originally to use so many of them, but when it be

rieTd so P rtri en ‘t and Tke Ameri ° an were going to 

[...] it was necessary to find more specialized scientific periodicals
that had some editorial or editorial-like material. The Bulletin of the
Atomic Scientists is much the richest in the kind [...]

nort°thJ a Sm f SampIe 7“ S taken here because the scientists who sup- ! 
L of AeTrofe'SonT ^ ^ “ n0t wh °% j 

Third, the question of what to read was important. In some
cases (e.g., [...] Times) it was thought sufficient
to read [...]; except for specific categories of
items (e.g., obituaries, articles “of an exclusively historical character
not giving value preferences in the period studied”) everything
in the Soviet publications was included in the analysis.
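
The effect of choosing different sampling intervals for different publications can be seen in a minimal sketch of systematic sampling. The run of 100 numbered issues and the fixed starting point are assumptions for illustration only; the intervals 22 and 43 are the ones reported for Pravda and Krasnaia Zvezda.

```python
def systematic_sample(issues, interval, start=0):
    """Return every `interval`-th item of `issues`, beginning at index `start`."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    return issues[start::interval]


# Hypothetical run of 100 consecutively numbered daily issues.
issues = list(range(1, 101))

every_22nd = systematic_sample(issues, 22)
every_43rd = systematic_sample(issues, 43)

print(every_22nd)
print(every_43rd)
print(len(every_22nd) / len(issues), len(every_43rd) / len(issues))
```

Over the same run of issues the shorter interval yields nearly twice as many sampled issues, so the two publications enter the analysis at different sampling fractions unless the results are weighted in some way.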
The representational model in Angell’s study is, in essence,
merely an extension of the “prestige papers” idea, except that he
explicitly rejects any inferences about the readers of the journals.
Angell nonetheless compounds the problem faced by Pool and
others by considering several publications as representative
of separate elites in Soviet and American life. Even if we are
willing to accept the New York Times as indicative of “elite” attitudes,
we may be unwilling to agree that the American Bar
Association Journal is indicative of any attitudes other than those
of the persons writing its editorials. It may still be possible to view
the distribution of values in all the American publications taken
together as somehow an indicator of values held by a broad
cross-section of American elite groupings; similarly, the entire collection
of Soviet journals may give us a better idea of Soviet elite
values than would one “prestige paper” by itself. 36

Angell’s analysis raises an interesting question about intra-country
variations in totalitarian societies. He writes:

enough free play has developed near the top of Soviet society since the

pling design thatrife ?“ ”J“"* ““ 1 - 1™! ^ 

somehow according to their "representativeness ” h * h J0UrnaIs were we lg hted 

» SrtuttTl' S ™' *•“ ■?’" “. »*™-coaMiy than ' 

analyses of within-nation differences. & ^ present mter esting possibilities for 
death of Stalin for elite differences of value to come to light in Soviet
periodicals. It is true that the explicit differences on the Soviet side are
not striking, but in a good many of our value dimensions they are real.

To what extent may we expect publicly expressed attitudes, beliefs,
and values to be different in various Soviet elite publications?
If we expect uniformity (the monolithic hypothesis) and
discover differences, must we revise our estimate of an enslaved
press? Or if we expect differences (the pluralist hypothesis) and
find them, is our expectation confirmed? Unfortunately, if all we
have access to is Angell’s data, we must respond negatively to
both these latter questions. What is needed to test the alternative
hypotheses is time-series data to establish the changing limits of
acceptable differences in a totalitarian society: we need to know
whether the level of intra-country variation is now greater, lesser,
or about the same as it was during Stalin’s heyday.
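
One way to picture the time-series comparison called for here is to compute, for each period, how widely the publications differ on a single value dimension. The sketch below is only an illustration under stated assumptions: the period labels, publication names, and shares are invented, and the population standard deviation is one possible dispersion measure among many, not one proposed in the text.

```python
from statistics import pstdev

# Invented data: for each period, the share of coded statements in each
# publication that take a given value position.
shares = {
    "earlier_period": {"pub_a": 0.20, "pub_b": 0.20, "pub_c": 0.20},
    "later_period": {"pub_a": 0.10, "pub_b": 0.20, "pub_c": 0.30},
}


def dispersion_by_period(shares_by_period):
    """Population standard deviation of publication shares within each period."""
    return {
        period: pstdev(list(by_publication.values()))
        for period, by_publication in shares_by_period.items()
    }


spread = dispersion_by_period(shares)
for period, value in spread.items():
    print(period, round(value, 4))
```

Only a comparison of such figures across periods, not a single cross-section, could show whether the limits of acceptable difference have widened or narrowed.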

Finally, in reviewing Angell’s study, we must ask if it is proper
to accept for analytical purposes views attributed to one elite by
members of another. Before we can consider such evidence it is
necessary to know, for instance, how reliable an estimate of the
“military mind” is likely to be found in the editorial columns of
the Bulletin of the Atomic Scientists, or how accurately writers in
GeoTimes will reflect the mood of America’s cultural elite. Fortunately,
Angell clearly recognizes the danger of misinterpretation
due to such attributions, and even keeps the attributed value
positions separate from the direct assertions in his tables and
analyses.

Validating Representational Models: A Plea for Future Research 

The bulk of my remarks to this point have been critical. In concentrating
on the weaker aspects of some of the recent cross-national
content analyses, however, I do not mean to suggest that
the analyses themselves have been without merit. But the fact
that their results have been both interesting and fruitful in terms
of generating hypotheses about political behavior does not hide
the fact that their theoretical underpinnings and assumptions have
often been insufficiently examined. Perhaps the time has come for
content analysts to look again at their research, just as survey
researchers in the 1940’s and early 1950’s turned their attention to


37 Ibid., p. 335.

some procedures that many of them had come to take for granted.
The following list of specific and feasible research tasks is certainly
not exhaustive, but it may serve as a beginning.

1. We need intensive analyses of a wide variety of publications
in each country that is of interest to us. What we want to know
is the range of attitudes, perceptions, and values expressed in
these media on a variety of variables over time. If we had such
information, it would be possible for a researcher to specify that
for a particular analysis he wants to examine a publication that is,
let us say, pro-labor but otherwise conservative on economic matters
and liberal politically; or a conservative political journal with
avant garde attitudes on culture; or a newspaper that, on a given
range of variables, is “typical” of all newspapers in the country;
or a journal that became progressively more liberal over time.

2. It would be possible to compare such empirically-based delineations
of media characteristics with the ratings of knowledgeable
judges on the same dimensions. For some purposes it may
turn out that judgmental ratings are sufficiently accurate; or we
may find that judges with certain types of background are most
qualified to rate the press on certain dimensions.

3. We need to pay closer attention to readership surveys, such
as those conducted for advertising purposes, to get an idea of the
audience of particular publications. It should be possible, for
instance, to find out approximately how many people with high
professional and socio-economic status report that they read the
New York Times, Nation, or Life. Looked at from the other direction,
it should be possible, after getting estimates of the number
of members of a particular elite grouping, to determine what percentage
of that number reads a particular journal.

4. Closely related to this is the need to utilize both intensive and
extensive survey research to determine the extent to which
people’s views (attitudes, perceptions, values) parallel those presented
in the publications they read regularly. Note that here I
am asking for the delineation of an empirical relationship rather
than inquiring into causality (i.e., does the reader read the publication
because it reflects his views? or does the periodical shape
his views? or is the true relationship a bit of both?).

5. It would be useful to have corroborative evidence of an
aggregate nature for a content analysis. Examples of such evidence
might include public opinion analysis, elite interviews, content
analysis of other types of messages, and other indicators of behavior. 38
Ideally, of course, for any research project we would
like to use all indicators available, bringing them to bear upon the
aspect of behavior in which we are interested. For instance, if we
are interested in the development of Western European attitudes
toward arms control and disarmament, all the above types of evidence
could serve as independent indicators of different aspects
of Western European decision-making processes.

6. Turning from inferences about antecedent events (WHY,
WHO) to inferences about consequent events (WHOM, WHAT
EFFECT), there is a crying need to integrate social psychological
data on attitude change with the theory of content analysis. What
is the function of an attitude for an individual? What conditions
maximize the impact of a written communication on a person's
attitudes, perceptions, and values? (For example, what role does
the person's previous level of information play? Or his commitment
to a particular ideology?) How important is written communication
relative to an individual's face-to-face communications
network?

7. It should also be possible to undertake intensive interviews
with a wide variety of elite groupings to ascertain the extent to
which their members report that they are influenced by particular
media. Do they look to these media for authoritative information
and ready-made views? What other sources influence them? Are
the particular media in which we are interested more or less
important than the other sources of information and attitudes?

8. For cross-national research the question of functional equivalence
is crucial. The task is to find sets of messages that perform
approximately the same function in all the societies included in
the analysis and that are capable of being analyzed using the same
set of research tools. Some differences are ones of format: The
New York Times and The Times of London have clearcut editorials
on a variety of issues, for instance, whereas Le Monde usually
carries only one editorial (on foreign policy) and the Frankfurter
Allgemeine Zeitung relies mainly on signed, editorialized news
columns. Other differences are more fundamental. What is the
level of party partisanship in the press? What is the national ethic
about an independent and "objective" press? How widely read by
the opposition are "prestige papers"? What effect does the literacy
rate of a country have upon the nature and influence of its press?
What is the difference between "elite" and "mass" newspapers in
different countries? What is the difference in press attitudes
between countries with regional newspapers and countries with
national newspapers (e.g., between the United States and Great
Britain)? How comparable is the press in totalitarian and
democratic societies? What are the bounds of permissible disagreement
in the different media of a communist state? What effect does a
change in government have upon the press of various countries?
Questions such as these, I might add, are fairly simply answered
for European countries, but grow extremely complex when we
try to account for media in non-Western areas.39

38 One approach is suggested by Ole R. Holsti and [...] other
questions arise. If it is true [...] perceptual and the other data
in the six critical [...] that led up to the outbreak of World War I,
is it also true that perceptual data [...] with the "hard" data in
times of peace? [...] high degree of correlation in [...]
circumstances, why bother with the arduous task of collecting the
perceptual data? Using [...] Lazarsfeld's concept of the
interchange[...] in terms of research time [...]

The research tasks that I am proposing are basic. They will
not produce glamorous results that will dazzle the eyes of our
colleagues and lay a golden path to foundation support. But they
will serve to give a firmer foundation to cross-national research
using content analysis.

2. Statistical Techniques

Introductory Note 

"La carrière ouverte aux talens" -Napoleon

[...] offers seven statistical [techniques] applicable [...]
involving [...]. Cluster analysis or numerical taxonomy [...]
hypotheses [...] influence relationships in groups. General linear
hypothesis is a technique for discerning causal inferences. It may,
the author notes, test the hypothesis that religion is a significant
variable in explaining political contributions by corporate
executives. To use an example suggested by [...], it may test the
hypothesis that a high level of [...] linear regression. His [...]
quantity, suggests analogous political applications [...].

Response surface analysis includes a variety of statistical
procedures which may be utilized to discern "the [...] conditions
for a system." The canvassing resource allocation problem,
discussed by Kramer in this volume, comes immediately to
mind. The techniques discussed by Kossack are designed to yield
optimum operational conditions to inform the decision-maker in
planning and in the use of resources.

Classification techniques, the final procedures noted by Kossack,
have a variety of practical applications to politics, particularly
in the field of administration. The author uses the
admissions problem in a college: evaluating prospective students
by means of high-school grades and a battery of test scores. The
problem requires that similar data on a previous population of
students and actual college performance by these students be
correlated in order to estimate the likelihood of the [...]
population completing the college curricula. A useful and realistic
classification system may be constructed when the model is
applied to the earlier experience.

[...] "Conflict in the General Assembly," American Political Science
Review, LVIII (September, 1964), 642-57.

S. Sidney Ulmer uses the [...] McQuitty to analyze [the] behavior of
judges. See Ulmer, "The Analysis of Behavior [...] United States
Supreme Court," Journal of Politics, XXII [...].

[...] (New York, [...]), p. [...].






Statistical Analysis, The Computer 
and Political Science Research 

CARL F. KOSSACK 
University of Georgia 
Introduction 

As our society becomes more and more complex, the need for
sophisticated methods of analysis through which one can study
problems associated with various activities becomes more and
more evident. This need is present not only in the biological and
physical sciences but also in the social and political sciences. In
fact, one can successfully argue that in the so-called soft-sciences
the need is increasing at an even more rapid rate than in the
hard-sciences. Our social and political systems have become
increasingly complex, with higher order interactions playing a role
today that was unknown in the earlier rural society.

Since one of the approaches considered by statisticians is that
of inductive reasoning (making valid conclusions from evidence
contained in observational data), and since man's understanding
of most political phenomena is such as to exclude the possibility
of mathematical or stochastic modeling of the system, it seems
appropriate to consider in this paper some of the recent advances
that have been made in statistics, particularly as they relate to
problems encountered in political science research. If one
associates with these advanced statistical applications the power of
modern digital computers, it is now quite feasible to consider
applications which one could not even dream about some ten
years ago. One should also note that modern computers provide
a technological bridge which can make available to the researcher
analytical techniques which far exceed his in-house capabilities.

In considering the phases through which a scientific investigation
usually advances, one possible classification of these phases
would be:

The Data Analysis Phase: During this phase of a scientific
inquiry, the researcher acquires observational data associated with
the phenomenon being studied and "processes" these data in an
attempt to discover important or interesting relationships within
the data as well as to screen the data to eliminate errors along
with non-discriminating variables or measurements. One of the
most important statistical aspects to be considered during this
phase is that of the sampling plan to be used during the collection
of the data, since one not only is interested in having his data
representative of the general situation but is also interested in the
efficiency of his data collection procedures.

The Hypothesis or Relationship Testing Phase: After analyzing
the initial set of data, the researcher progresses to the stage where
he can generate hypotheses with regard to the phenomenon under
study. Since these hypotheses generally were acquired using
available observational data, it is required to test such hypotheses
statistically. To do this it is generally necessary for one to develop
an experimental design or sampling plan from which one efficiently
acquires new data upon which such statistical tests of the
hypotheses can be based.

The Decision-Making Phase: Once a phenomenon is well enough
understood, at least as far as its role in applied systems is
concerned, interest centers on the use of this understanding of the
phenomenon to improve decision-making capabilities. In statistics,
this type of activity falls under the general category of statistical
decision theory as applied to complex systems. The natural
outgrowth of system decision-making is that of stochastic modeling
of the system, including the study of such models through the use
of simulation techniques. In fact, many individuals feel that the
ultimate objective of research is the formulation of a stochastic or
probability model of the phenomenon under study and the use of
such a model to improve one's decision-making capabilities.

It is not the intention of this paper to consider in depth the 
nature of scientific study, but simply to note the natural phases 
through which many such statistically oriented studies evolve. 

The Role of the Computer 

It has been generally recognized that the modern digital computer
has greatly enhanced man's ability to process data and to
do scientific computing. In fact, it may be conservatively stated
that in the last twenty years computational power has increased
six orders of magnitude, indicating that in some respects whatever
data processing was evolved in the 1940's can now be done
a million times faster. Of real interest is how this increased
capability will affect our scientific analyses in the future. This becomes
especially important when one realizes that most problems of
interest are multivariate, often involving the time variable. Until
recently, most analyses were unable to cope adequately with such
dynamic multivariate problems, and thus the researcher had to be
satisfied with analyses that involved fairly restrictive assumptions.
It is only natural to expect that some of the increased computational
power available in the modern high-speed digital computer
will be harnessed so as to reduce dramatically these restrictive
assumptions.

Still another aspect of modern computers is their ability to use
internally stored programs. This means that sophisticated analyses,
once they are programmed, can be made available to individuals
all over the world. This capability should enable us to bridge the
technological gap that exists between modern theory and practice.
It is to be expected that the emerging sciences should find it
possible to take advantage of theories and techniques that have
been evolved in the more established fields without having to go
through the long evolutionary mathematical process required in
the past.

These two aspects of modern scientific computers challenge
one to consider how advanced statistical techniques may be
introduced into scientific disciplines in a fashion that would enable one
to apply these new techniques to his particular research problem
with the minimum of difficulty.

Advanced Statistical Applications

I have reviewed the recent statistical literature and have selected
from the new techniques found in the literature those which
appear to be most promising, keeping in mind the dynamic-multivariate
nature of most applied research problems and the
capability of modern digital computers. In the remainder of this
paper, I would like to discuss briefly seven such advanced
techniques. In these discussions, I will try to follow a regular pattern:
(a) the type of problem for which the technique is appropriate,
(b) how the technique "solves" the problem, (c) what are the
general advantages in utilizing a computer when applying the
technique, and (d) a small list of appropriate references.

The techniques selected are:

Data Analysis

I. Factor Analysis

II. Power Spectrum Analysis

Hypothesis Testing

III. Cluster Analysis - Numerical Taxonomy

IV. General Linear Hypothesis

Decision Making

V. Simultaneous Regression

VI. Response Surface Analysis

VII. Classification Techniques

I. Factor Analysis 

In factor analysis, one is concerned with how to account for the
observed correlation among all the observed variables associated
with the phenomenon under study in terms of the smallest number
of factors and the smallest residual error. In many respects, the
problem considered by a factor analysis is that of attempting to
reduce the number of variables needed to describe a phenomenon.
It is recognized that if one indiscriminately adds more variables
to his observational vector the law of diminishing returns sets in,
and soon one is losing rather than gaining information because of
the noise introduced by the additional variables. Thus, one would
like to find a new set of variables (factors) which are essentially
uncorrelated, each of which adds significantly to the information.

In the analysis, no distinction is made between so-called
independent and dependent variables, since prediction is not a
consideration. Thus, while in regression the constants found as
regression coefficients are merely constants used in the prediction, in
factor analysis the constants obtained are subject to the demand that
the weights they give to the derived variables must admit of
interpretation and the derived variables must have a scientifically
meaningful interpretation. Fundamentally, the object is to
discover whether the variables can be made to exhibit some
underlying order that may throw light on the processes that produce
the individual differences shown in all the variables.


In a factor analysis, the basic mathematical model is

$$S_{ji} = C_{j1}x_{1i} + C_{j2}x_{2i} + \cdots + C_{jk}x_{ki} + \cdots + C_{jq}x_{qi}$$

where $S_{ji}$ is the "score" (measure) made by the $i$th individual
on the $j$th "test" (variable),

$x_{qi}$ is the measure for the $i$th individual on the $q$th factor
(uncorrelated reference ability),

and $C_{jq}$ is the weight given the $q$th factor relative to the
$j$th variable.

To determine the weightings, the $C_{jq}$'s, the correlation matrix of
the original observational variables is "factored." The mathematical
problem associated with this factoring, since factoring is not
unique, is to make the factoring such that the smallest number of
interpretable factors are used, leaving an insignificant unexplained
residual. It is evident that such a technique requires judgment on
the part of the investigator, and, in fact, the several different
methods of factoring the correlation matrix appear to yield different
levels of effectiveness, depending on the particular type of
problem being considered.

The most common solution which has been programmed for
digital computers is that which is called the principal component
solution coupled with an orthogonal rotation of the factor matrix.
Input data for such programs usually are in the form of raw data
but may simply be the resulting inter-correlation matrix. Included
in the output of most computer programs is the initial factor matrix,
which is simply the coefficients $C_{jq}$, and the orthogonal rotated
factor matrix. The satisfactory nature of the solution depends upon
the ability of the analyst to interpret the factors effectively,
considering their loadings relative to the original variables.

The power of the digital computer to perform a factor analysis
is clearly indicated when one considers that there exists a program
which will handle up to 80 variables with up to 10,000 cases
(individuals) using as little as an hour of running time for this size
problem. In fact, without a computer, a problem of this magnitude
simply could not be analyzed.
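The principal component solution described above can be sketched in a few lines on a present-day machine. The following is only an illustration under stated assumptions: the data are simulated (six hypothetical variables driven by two hidden factors), the function name `principal_factor_loadings` is invented for this example, and the orthogonal rotation step is omitted. The array `loadings[j, q]` plays the role of the weights $C_{jq}$ before rotation.

```python
import numpy as np

def principal_factor_loadings(scores, n_factors):
    """Return (loadings, eigenvalues) from the correlation matrix of `scores`.

    scores: array of shape (n_individuals, n_variables).
    loadings[j, q] corresponds to the unrotated weight C_jq in the text.
    """
    corr = np.corrcoef(scores, rowvar=False)       # variables are in columns
    eigvals, eigvecs = np.linalg.eigh(corr)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]              # reorder, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return loadings, eigvals

rng = np.random.default_rng(0)
# Hypothetical data: 500 individuals, 6 variables driven by 2 hidden factors.
factors = rng.normal(size=(500, 2))
weights = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.2],
                    [0.1, 0.8], [0.0, 0.9], [0.2, 0.7]])
scores = factors @ weights.T + 0.3 * rng.normal(size=(500, 6))

loadings, eigvals = principal_factor_loadings(scores, n_factors=2)
# Residual correlation left unexplained by the two retained factors.
residual = np.corrcoef(scores, rowvar=False) - loadings @ loadings.T
```

Inspecting `residual` is one way to judge whether the retained factors leave "an insignificant unexplained residual"; deciding how many factors to keep, and how to rotate them, remains a matter of judgment.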

II. Power Spectrum Analysis 

One frequently obtains observations that are in the form of a
continuous or discrete time series. Often such series are of a
type that is called stationary. By this we mean that, if a random
sampling is made from the time series with equal times
between observations, obtaining the sequence of observations
$x_1, x_2, x_3, \ldots, x_t, \ldots, x_n$, these $x$'s are such that for all $t$,
$E(x_t) = \mu$, the variance of $x_t$ is $V_0$, and the covariance of
$(x_t, x_{t+s})$ is $V_s$ for all integer $s$. Of particular importance is
the fact that the covariance between two observations $x_t$ and
$x_{t+s}$ depends only on the time separation $s$ and not on the clock
time $t$. One should recall that the covariance of $(x_t, x_{t+s})$ is
defined as

$$\operatorname{Cov}(x_t, x_{t+s}) = E\left[(x_t - \mu_t)(x_{t+s} - \mu_{t+s})\right] = r_s\,\sigma_{x_t}\,\sigma_{x_{t+s}}$$

Thus a stationary time series is such that the correlation between
two observations does not depend upon where one takes the
observations but only on how far apart the two observations are.
We really are dealing with a signal that exhibits periodicity over
time rather than dynamic change. The problem associated with
the analysis of such stationary time series is to convert the
analogue signal into quantitative values which will then admit more
readily to mathematical analysis.

Since the time series is periodic, it seems natural to assume that
the signal is a composite of several cosine functions of varying
frequency and amplitude. Along with these cosine functions one
assumes that there is superimposed a random noise factor. Thus
one can hope to decompose the time series into the significant
cosine functions and to replace the continuous set of data with a
finite number of frequencies and associated amplitudes of these
cosine functions. These frequencies and amplitudes can then be
used in pattern recognition type problems so as to be able more
readily to recognize patterns in the signal and to be able to
distinguish between signals coming from different underlying
conditions or sources.

In solving this decomposition problem through the use of a
power spectrum analysis, one formally considers that the signal is
given as a function of time, say $x(t)$. Then the autocovariance
function is defined by

$$C(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{+T/2} x(t)\,x(t+\tau)\,dt$$

and the power spectrum at frequency $f$ is given by

$$P(f) = \lim_{T \to \infty} \frac{1}{T} \left| \int_{-T/2}^{+T/2} x(t)\,e^{-i 2\pi f t}\,dt \right|^{2}$$

Now if the signal $x(t)$ corresponds closely with the function
$\cos 2\pi f t$, the value of $P(f)$ is large, while if the signal fails to
correspond, $P(f)$ will be small. In fact, if there is actually no
correspondence, $P(f)$, though theoretically zero, would exhibit some positive
value due to the random noise entering the evaluation. Pictorially
we could represent the stationary time series by the following
figure:

[Figure: a stationary time series]

The power spectrum of such a time series could be represented as follows:

[Figure: power spectrum of the time series]

From this representation, it may be recognized that the original
signal consists mainly of three sinusoidal functions
with frequencies $f_1$, $f_2$, and $f_3$, with amplitudes whose squares
(power) correspond to the heights of the three peaks on
the power spectrum graph.
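The three-peak picture can be reproduced numerically. The sketch below is an illustration only: the signal is simulated, the frequencies 2, 5 and 9 cycles per unit time and the noise level are assumptions chosen for the example, and the function name `periodogram` is hypothetical. It approximates $P(f)$ over a finite record by a discrete Fourier transform and picks out the three largest peaks.

```python
import numpy as np

def periodogram(x, dt):
    """Return (frequencies, power) for an evenly sampled series x.

    power[k] approximates (1/T) * |integral of x(t) exp(-i 2 pi f t) dt|^2
    over a finite record of length T = len(x) * dt.
    """
    n = len(x)
    total_time = n * dt
    # Riemann-sum approximation of the integral at the FFT frequencies.
    spectrum = np.fft.rfft(x - np.mean(x)) * dt
    power = np.abs(spectrum) ** 2 / total_time
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, power

rng = np.random.default_rng(1)
dt = 0.01
t = np.arange(2000) * dt                      # record length T = 20 time units
true_freqs = [2.0, 5.0, 9.0]                  # assumed for the illustration
amps = [1.0, 0.6, 0.8]
signal = sum(a * np.cos(2 * np.pi * f * t) for a, f in zip(amps, true_freqs))
signal = signal + 0.5 * rng.normal(size=t.size)   # superimposed random noise

freqs, power = periodogram(signal, dt)
# Frequencies of the three highest peaks in the estimated spectrum.
top3 = np.sort(freqs[np.argsort(power)[-3:]])
```

In this idealized case each peak height is proportional to the squared amplitude of the corresponding cosine, while the noise contributes only a low, irregular floor at the remaining frequencies.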

In practice, the problem of estimating the power spectrum
through the use of digital computers requires first that one replace
the infinite range considered in the theory by a finite range. Now
it is apparent that the restriction on the range of the data used
restricts the frequency range for which one can obtain estimates.
The lowest practical frequency that can be estimated corresponds
to one-half the range used. Next, one must replace the continuous
analogue signal with isolated sampled points. The sampling rate
used also imposes a restriction on the range of frequencies that
can be studied, since the highest frequency about which one will
have information is $\pi/k$ in angular units (that is, $1/(2k)$ cycles
per unit time), where $k$ is the length of time between
sampled points. Unfortunately, the desire to take a longer range
of data with more frequent samplings is usually thwarted by lack
of sufficient data and/or the lack of computer energy to analyze
the multiplicity of points generated by a high sampling rate.
In the computer programming of a power-spectrum analysis,
the dimensions of the program depend on the size of the computer
being used; but for a large-size computer [...]
that can handle up to 20 different series with up to [...] discrete
data points per series. In computing the [...]
200 lags can be considered. Not only does the output of the program
include a plot of the input data, the autocorrelation function,
and printed and plotted power spectral estimates, but one is
enabled to examine the interrelationship between two series by
having the computer determine and print out the [...]
and the "coherence function" for any pair of signals from the
several series used as input for the program.

III. Cluster Analysis or Numerical Taxonomy 
In many investigations, the amount of information obtained
about each individual and the number of individuals studied are
so large that the investigator finds it difficult to know where to
start his analysis. Since one of the purposes of science is to
generalize, one often would like to group individuals together into
more or less homogeneous groups relative to the several measurements
that have been made on each individual. At the same time,
it may be of interest to differentiate between the numerous variables
as to those which provide discriminatory power in separating
the groups.

A cluster analysis or numerical taxonomy program is designed
to uncover statistical similarities within the data, to form clusters
of the most similar cases, and to select those attributes which
are statistically more important in determining the classification
derived through this method. With the large mass of data and the
many observations from each individual, the problem is to identify
a few main classes of data rather than the many individual cases.
Technically one may argue that the data represent a mixture
of samples from several distinct populations and that one is
interested in sorting the observations into their respective population
groups without even knowing how many populations are involved.
In some respects the technique could be called statistical sorting.

In most numerical taxonomy programs, the analysis is
approached by first converting the data into attributes. That is, the
information is reduced to a set of variables which can take on only
the values zero or one. It is apparent that both quantitative and
qualitative data can be converted into attributes, since the
attribute concept is essentially the basis behind all information. Thus,
for categorical type data, each possible class from a given category
can be made an attribute variable, with the zero indicating the
individual is not in the class and the one indicating the individual
is in the class. In the case of measured variables, the range can be
divided into many subranges and each subrange be considered
as an attribute variable.
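The conversion into zero-one attributes can be illustrated as follows. This is a minimal sketch, not a reconstruction of any particular taxonomy program: the function names, the party labels, the ages, and the subrange boundaries are all hypothetical.

```python
import numpy as np

def categorical_to_attributes(values):
    """One 0/1 column per distinct class; returns (matrix, class labels)."""
    classes = sorted(set(values))
    matrix = np.array([[1 if v == c else 0 for c in classes] for v in values])
    return matrix, classes

def measured_to_attributes(values, edges):
    """One 0/1 column per subrange [edges[k], edges[k+1])."""
    values = np.asarray(values, dtype=float)
    # np.digitize returns k such that edges[k-1] <= v < edges[k].
    bins = np.digitize(values, edges) - 1
    n_sub = len(edges) - 1
    matrix = np.zeros((len(values), n_sub), dtype=int)
    for row, b in enumerate(bins):
        if 0 <= b < n_sub:            # values outside all subranges get no attribute
            matrix[row, b] = 1
    return matrix

# Hypothetical cases: a categorical variable and a measured variable.
party = ["D", "R", "D", "I"]
age = [23, 47, 65, 38]
party_attr, party_classes = categorical_to_attributes(party)
age_attr = measured_to_attributes(age, edges=[18, 30, 50, 120])
attributes = np.hstack([party_attr, age_attr])   # one row of 0/1 attributes per case
```

Each case ends up with exactly one attribute set per original variable, so the choice of subrange boundaries for measured variables directly affects which cases later appear similar.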

The analysis then generates a similarity coefficient which
indicates which observations have essentially the same
attribute structure. Thus we may use

$$S_{ij} = M_{ij} / N_{ij}$$

where $M_{ij}$ represents the number of attributes possessed in common
by cases $i$ and $j$, while $N_{ij}$ represents the number of attributes
possessed by either of them. The similarity ratio $S_{ij}$ could be
considered as the weighted probability of finding a matching attribute
between the two individuals for any characteristic selected at
random. Hence, a very large value of $S$ would indicate a much
greater degree of similarity between the two cases than is likely
with random distribution of the attributes. Similarly, a very low
value of $S$ would indicate a non-random divergence of characteristics
between the two cases.
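For zero-one attributes the ratio $S_{ij} = M_{ij}/N_{ij}$ can be computed for all pairs at once. The sketch below assumes the attribute matrix has one row per case; the function name and the three example cases are hypothetical, and the convention of setting $S_{ij} = 0$ when neither case possesses any attribute is an assumption made here to avoid division by zero.

```python
import numpy as np

def similarity_matrix(attributes):
    """S[i, j] = M_ij / N_ij for a 0/1 attribute matrix (cases in rows)."""
    a = np.asarray(attributes, dtype=int)
    common = a @ a.T                                       # M_ij: held by both cases
    counts = a.sum(axis=1)
    either = counts[:, None] + counts[None, :] - common    # N_ij: held by at least one
    with np.errstate(divide="ignore", invalid="ignore"):
        # Assumption: similarity is 0 when neither case has any attribute.
        s = np.where(either > 0, common / either, 0.0)
    return s

# Hypothetical cases described by four attributes each.
cases = [[1, 1, 0, 0],
         [1, 1, 1, 0],
         [0, 0, 0, 1]]
S = similarity_matrix(cases)
```

Here the first two cases share two of the three attributes that either possesses, so their similarity is 2/3, while the third case shares nothing with the others.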

In order to measure roughly how "typical" a case is, a count $R$
is made of the number of other cases with which the case in question
has at least one attribute in common. Finally, a measure $H$
is made for each case by multiplying together all the $R$ non-zero
values of $S$ for each case. Thus,

$$H_i = S_{i1} \times S_{i2} \times S_{i3} \times \cdots \times S_{i,n}$$

with the product taken over the non-zero terms. Here $H_i$ can be
thought of as representing the probability, for any
characteristic selected at random of those attributes possessed by
[...], that [...] non-zero $R$ cases would possess the
attribute. We have for each case two measures of typicality, $R_i$ and
$H_i$. $H_i$ can be considered as a refined measure that could
be used to differentiate cases with the same $R$.

Thus, one may rank the cases first by descending values of $R$
and then by descending values of $H$ within each value of $R$. The
first case listed can then be considered as the center of the first
population grouping, and other members of the cluster are then
identified by high values of $S$ in conjunction with this nodal case.
The problem of where to cut off the cluster must be resolved
either by using judgment or by the ordered listing of $(R, H)$'s.
One may decide that the second cluster center should be taken
to be the individual who is at some predetermined rank within
the table. When that case is found in the ordering of $S$'s around
the first cluster center, the cluster is truncated and a new cluster
determined around the new center using the $S$'s associated with
the remaining cases and the new nodal case. Thus, the clustering
can continue until all cases are sorted into one of the clusters.
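The ranking step can be sketched as follows. This illustration covers only the computation of $R$, $H$, the ordering, and the listing of cases around the first nodal case; the cutoff rule is left to judgment, as in the text. The function name, the example similarity matrix, and the convention $H = 0$ for a case that matches nothing are assumptions made for the example.

```python
import numpy as np

def typicality_ranking(s):
    """Return (R, H, order) from a symmetric similarity matrix s."""
    s = np.asarray(s, dtype=float)
    off_diag = s.copy()
    np.fill_diagonal(off_diag, 0.0)        # ignore each case's similarity to itself
    nonzero = off_diag > 0
    r = nonzero.sum(axis=1)                # R_i: other cases sharing an attribute
    h = np.where(nonzero, off_diag, 1.0).prod(axis=1)   # product of non-zero S_ij
    h = np.where(r > 0, h, 0.0)            # assumption: H = 0 if nothing matches
    order = np.lexsort((-h, -r))           # R descending, then H descending
    return r, h, order

# Hypothetical similarity matrix for four cases.
s = np.array([[1.0, 0.5, 0.2, 0.0],
              [0.5, 1.0, 0.4, 0.0],
              [0.2, 0.4, 1.0, 0.1],
              [0.0, 0.0, 0.1, 1.0]])
r, h, order = typicality_ranking(s)

nodal = int(order[0])                      # center of the first cluster
# Remaining cases listed by descending similarity to the nodal case.
neighbours = [int(j) for j in np.argsort(-s[nodal]) if j != nodal]
```

In this small example the third case has an attribute in common with every other case and so heads the list; the remaining cases are then ordered by their similarity to it, and the analyst decides where the first cluster ends.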

The entire approach to numerical taxonomy makes the use of
a high-speed digital computer most appropriate. The many
comparisons and testings made in deriving values for M, N, S, R,
and H are operations which a computer handles naturally. The
sheer volume of the operations also requires a processing speed
found only in electronic computers. For example, if there are only
100 cases in the collection, then 4,950 values each of M, N, and S
must be calculated for every analysis made with new or revised
information. The basic approach described above is quite suitable
for programming for most types of computers.

IV. General Linear Hypothesis 
In experimental situations it is common to analyze multiresponse
experimental data using the appropriate univariate analysis on
each response. However, there is often an interdependency between
these responses which would be ignored by this
single-response-at-a-time approach. At the same time, the various
univariate analyses that one may elect to use, depending upon the
circumstances of the experiment, each involve, even when considered
separately, the analysis of some underlying mathematical
model, including both the estimation of parameters and the testing
of hypotheses associated with the model. The "general linear
hypothesis" concept combines these mathematical models into a
single general model, enabling one to use a single analytical
approach to such problems and at the same time to handle the
multiresponse problem.

It seems best to introduce the general linear hypothesis concept
in the single response variable case and then to discuss its
generalization to multi-response type analyses. An attempt will be
made to show how the general linear hypothesis model includes
the analysis of variance, regression, and the analysis of covariance
as special cases.

In an attempt to show the generality of the model and the power
of the matrix notation, the generalized linear model concept will
be summarized in matrix form and then the model will be applied
to one or more of the special cases noted above, expanding the
expressions into a non-matrix notation. A mathematical model
must first be evolved showing how one feels the response variable
(dependent variable) relates to the design variables (independent
variables). In the general linear case, we have the model*

$$y = A\xi + e.$$
This model reduces to the following special cases:

i) The analysis of variance model (simple two-way design)

$$y_{ij} = \mu + \tau_i + \nu_j + e_{ij}$$

where $\tau_i$ is the $i$th treatment effect, $\nu_j$ is the $j$th block effect,
$\mu$ the overall mean, and $e_{ij}$ the experimental error.

ii) The regression model (multiple linear regression)

$$y_i = \mu + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_r x_{ri} + e_i$$

iii) The analysis of covariance model (simple two-way design)

$$y_{ij} = \mu + \tau_i + \nu_j + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \cdots + \beta_r x_{rij} + e_{ij}$$

To demonstrate how the general model reduces to these special
models, consider the regression case. Then $y = A\xi + e$ can be
written in expanded form as:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} =
\begin{bmatrix}
1 & x_{11} & x_{21} & \cdots & x_{r1} \\
1 & x_{12} & x_{22} & \cdots & x_{r2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1n} & x_{2n} & \cdots & x_{rn}
\end{bmatrix}
\begin{bmatrix} \mu \\ \beta_1 \\ \vdots \\ \beta_r \end{bmatrix} +
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$

Expanding the expression for the $i$th element yields the form
given under (ii) above.


Associated with any mathematical model is a set of hypotheses
regarding the values of the parameters $\xi$, which are the unknown
constants of interest in the study. In a linear hypothesis
situation, these hypotheses must be expressed as linear relationships
involving the $\xi$'s and known coefficients. We thus have in the
general case the set of linear hypotheses expressed as

$$C\xi = 0$$

where $C$ is a known matrix of coefficients.

* Italic lower case letter indicates a vector, while upper case letter
indicates a matrix.

To illustrate again the applicability of the general expression to
the special cases, we have for typical types of hypotheses the
following:

i) The analysis of variance model

$$\tau_1 = \tau_2 = \cdots = \tau_t$$

The hypothesis that a certain set of the treatment effects are all
equal.

ii) The regression model

$$\beta_{r-s+1} = \beta_{r-s+2} = \cdots = \beta_r = 0$$

The hypothesis that the last $s$ independent variables have no
linear relationship with the dependent variable when considered
with the other $r-s$ variables.

iii) Analysis of covariance model

$$\tau_1 = \tau_2 = \cdots = \tau_t$$

$$\beta_{r-s+1} = \beta_{r-s+2} = \cdots = \beta_r = 0$$

A combination of the hypotheses introduced under (i) and (ii).
Before the given set of hypotheses can be statistically tested,
one must first estimate the parameters, $\xi$, using the available
experimental data. These estimates are given in general form by

$$\hat{\xi} = (A'A)^{-1}A'y$$

where the prime indicates the transpose of the given matrix and
$(\;)^{-1}$ indicates its inverse. In the special cases these estimates
reduce to the familiar least squares estimates. Thus in regression
on a single independent variable $x_i$,

$$\hat{\beta}_i = \frac{\sum_{j=1}^{n} (y_j - \bar{y})(x_{ij} - \bar{x}_i)}{\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2}$$

The testing of the hypotheses is accomplished by using a test
statistic in the form

$$F(n_h, n_e) = \frac{SSH / n_h}{SSE / n_e}$$

where

$$SSH = (C\hat{\xi})' \, [C (A'A)^{-1} C']^{-1} \, (C\hat{\xi})$$

and

$$SSE = y'y - \hat{\xi}' A' y$$

The statistic $F(n_h, n_e)$ is the Snedecor F statistic with $n_h$ and $n_e$
representing the degrees of freedom for the hypothesis and error
respectively. An alternative statistic that can be used is the Beta
statistic defined by

$$\beta\left(\tfrac{n_e}{2}, \tfrac{n_h}{2}\right) = \frac{SSE}{SSH + SSE}$$

The hypothesis is then rejected for significantly low values of $\beta$.
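The estimate, the two sums of squares, and the F statistic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`transpose`, `matmul`, `inverse`, `f_test`) and the five data points are hypothetical and not taken from the text, A is assumed to have full column rank, and the rows of C are assumed linearly independent so that the degrees of freedom are the row count of C and n minus the column count of A.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def f_test(A, y, C):
    """Return (estimates, SSH, SSE, F) for the linear hypothesis C xi = 0."""
    At = transpose(A)
    AtA_inv = inverse(matmul(At, A))
    ycol = [[v] for v in y]
    xi = matmul(AtA_inv, matmul(At, ycol))                 # (A'A)^-1 A'y
    Cxi = matmul(C, xi)
    middle = inverse(matmul(matmul(C, AtA_inv), transpose(C)))
    ssh = matmul(matmul(transpose(Cxi), middle), Cxi)[0][0]
    sse = sum(v * v for v in y) - matmul(transpose(xi), matmul(At, ycol))[0][0]
    n_h = len(C)                  # assumes independent rows in C
    n_e = len(A) - len(A[0])      # assumes full column rank in A
    return [row[0] for row in xi], ssh, sse, (ssh / n_h) / (sse / n_e)

# Hypothetical data: one regressor plus a constant term.
A = [[1.0, x] for x in (0.0, 1.0, 2.0, 3.0, 4.0)]
y = [1.1, 2.9, 5.2, 7.1, 8.8]
C = [[0.0, 1.0]]                  # hypothesis: the slope is zero
xi_hat, ssh, sse, F = f_test(A, y, C)
```

With these made-up numbers the slope estimate is 1.96 and the hypothesis of zero slope gives a very large F on 1 and 3 degrees of freedom; the corresponding Beta statistic would be `sse / (ssh + sse)`.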

Now, if the response is a vector quantity rather than a single
measure, the generalization is relatively straightforward, since
instead of considering y singly, we must consider the vector
$(y_1, y_2, \ldots, y_p)$, and the general linear hypothesis must be
written in the form

$$(y_1, y_2, \ldots, y_p) = A(\xi_1, \xi_2, \ldots, \xi_p) + (e_1, e_2, \ldots, e_p)$$

Thus, the vectors of observations, parameters and error terms all
become matrices, and we can symbolically write the model as

$$Y = A\Xi + E$$

The subsequent analysis of this model generalizes in a rather
straightforward fashion (See Poston).

In considering the implications of this theory to computer pro¬ 
gramming, one should realize that matrix algebra can be program¬ 
med for computers in a fairly straightforward fashion. In fact, sev¬ 
eral computer programming languages have been developed utiliz¬ 
ing vectors and matrices as basic elements in the language (See 
Bargmann). Thus, the researcher having access to such a com¬ 
puter program can analyze a large class of experiments without 
having to develop separate techniques for each special case. 

V. Simultaneous Linear Regression 
Many problems arise that require the construction of a mathe¬ 
matical model that will represent the operation of a social, politi- 
cal, or economic system. From this model one is interested in 
predicting future events that will follow when one or more vari¬ 
ables in the model are changed or determining what policy should 
be followed to give a desired result or outcome in the system. Or 
perhaps the purpose of the model is simply to describe the system 
in mathematical form. 

Given a model, one of the major research problems is to esti¬ 
mate its parameters. When the model is not explicitly stated, one 
often has to resort to simulation studies to help make reasonable 
estimates; however, in the case of a system expressed by
simultaneous linear equations, the parameter estimates can be
mathematically determined. One of the important models of this type is
that of multiple linear regression, where one assumes a single
dependent variable is governed in a linear fashion by the levels of a
number of other “independent” variables. Simultaneous linear
regression can be considered as a generalization of multiple regression
to the following situation. In the multiple regression relation, one
and only one variable in each equation may be chosen as the
dependent variable, whose changes can be explained by those of the
explanatory, or independent, variables. Very often this choice is
arbitrary, since economic and social relationships are not normally
formulated in a simple manner. A typical example from economics
would be price and quantity for a product. Surely one would be
hard pressed to determine which to call dependent if the two
occur in a set of additional “independent” variables. In fact, even
if such a designation were made, the multiple linear regression
model would not be properly estimated.

In simultaneous linear regression, one is able to introduce as the 
model a system of simultaneous linear equations rather than a 
single regression equation. The system is termed the structural set 
of equations, since each equation relates to a fundamental aspect 
of the phenomena being studied. Each structural equation may 
have more than one dependent variable and a number of inde¬ 
pendent variables. 

For example, one may have the following five equations in the
structural set:*

$$
\begin{aligned}
y_1 &= b_{12} y_2 + b_{15} y_5 + c_{13} z_3 + [\ldots] + c_{10} \\
y_1 &= b_{23} y_3 + b_{25} y_5 + c_{23} z_3 + c_{20} \\
y_2 &= c_{32} z_2 + c_{34} z_4 + c_{30} \\
y_3 &= b_{44} y_4 + c_{41} z_1 + c_{43} z_3 + c_{40} \\
y_4 &= b_{55} y_5 + c_{53} z_3 + c_{50}
\end{aligned}
$$

where the y's are dependent variables, the z's are independent
variables, and the b's and c's are the corresponding regression
coefficients.

Since there are several methods by which the regression co¬ 
efficients in such a set of simultaneous regression equations can 
be estimated, computer programs vary as to which or how many 
of these methods are included. Thus, we may use: 

*See M. A. Girshick and T. Haavelmo, "Statistical Analysis of the Demand for
Food: Examples of Simultaneous Estimation of Structural Equations," Econometrica,
XV (1947), 79-110.
(i) Single-equation least-squares, where the estimates are obtained
by considering each equation separately, assuming that for
estimation purposes the first dependent variable in the equation is
[...] variable. Usually, if this method is used, the results
are obtained for comparison purposes only.

(ii) Two-stage least-squares still uses the single-equation
approach, [...] in the first stage a correction is made in the estimate
for all but one of the dependent variables in the equation, and [...]
to compute [...] dependent variables on all the [...]

(iii) Limited-information estimation again uses the single-equation
[...] methods in making the approximation.

[...] with the sophistication of the method [...] estimates, and often
one is led to reduction [...] approaches in order [...]

The use of this approach in the analysis of a system or [...] among
the variables to be studied. In the political [...] study the work
done by others in related fields [...]
VI. Response Surface Analysis 

[...] a new concept [...] so much interested in testing the
significance of factors associated [...] has become known as
“response surface designs,” and [...]

The behavior of any reaction is governed by laws which should
be representable in mathematical form, and it should be possible
to determine the optimum conditions for [...] simply applying these
laws. However, [...] that the underlying mechanism of the system [...]
that the mathematical [...] is essentially impossible. This is
particularly [...] response surface designs are appropriate.

The theory is developed assuming that the response, Y, is
dependent upon n variables x which are capable of measurement
[...], and the problem is to find the combination of values of x
which optimize the response within the region of the n-dimensional
factor space where [...] as few experimental observations as
possible. The number of observations required will, of course,
depend upon the accuracy and precision of estimation desired. Where
the problem is one of minimization, it can always be converted to
one of maximization, for example, by considering the improvement as
compared with some standard instead of the actual level achieved.

The technique assumes that the response function can be
satisfactorily represented by a quadratic form in the area of interest,
i.e.,

$$Y = \sum_{i=0}^{n} \sum_{j \ge i}^{n} c_{ij} x_i x_j + e,$$

where Y is the property to be maximized, the $x_i$ are the levels of
the independent variables ($x_0 = 1$), the $c_{ij}$ are the unknown
parameters to be estimated from the experiment and e is the
residual or experimental error. The adequacy of the quadratic surface
representation of the true response surface of the process being
investigated depends on the use of a small sub-region of the factor
space within which one restricts his determinations. In some
experimental situations, such a small sub-region within which
the optimum point can be assumed to lie is already known to the
experimenter from previous experience. However, in the general
case the procedure of locating optimum conditions involves two
distinct phases. The first phase involves the location of the
neighborhood, while the second is to determine within the
neighborhood the optimum point.
The location of the neighborhood is accomplished by using
what is called the “method of steepest ascent.” In this procedure,
one assumes that the surface can be represented locally by a
sloping plane. Starting at any point, P, the experimenter estimates the
coefficients or slopes of the plane $Y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$
by performing a suitably arranged set of trials in a small
sub-region about P. From these observations, the coefficients are
estimated and one then calculates the direction of steepest ascent or
greatest slope up the plane. He then proceeds to a point, O, in
this direction, where new observations are made, the slopes are
redetermined, and the process repeated. In this way, by a
step-by-step procedure, points of higher and higher response are reached.

This procedure cannot, however, actually be used to reach the
maximum response point since, as one goes farther up the surface,
the slopes become more gradual and thus more difficult to
estimate. The second-order terms also become relatively more
important. The procedure generally followed is to compare the linear
effects with the error variance and with the second-order effects,
and if the linear model appears adequate, the path of steepest
ascent is determined. At the point of diminishing returns, a new
point is located, around which the process is repeated.

The experimental design used during the first phase, where one
is seeking the path of steepest ascent from a given point on the
surface, is generally of a two-level factorial type, where the origin
for each variable is taken at the initial point and the levels used
are equidistant from it in either direction. Thus, in a
three-variable situation, one would use a $2^3$ factorial design, and the
eight experimental points would be as shown in Table 1.

Table 1. Experimental Points for a $2^3$ Factorial Design

            Factor Level
     x_1        x_2        x_3
     +1         +1         +1
     -1         +1         +1
     +1         -1         +1
     -1         -1         +1
     +1         +1         -1
     -1         +1         -1
     +1         -1         -1
     -1         -1         -1
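The coded points of a two-level factorial design are easy to generate mechanically. The following is a minimal sketch; the function name `two_level_factorial` is an illustrative choice, and the ordering (first factor changing fastest) is chosen only so that the output for three factors lists the same eight points in the same order as Table 1.

```python
from itertools import product

def two_level_factorial(n):
    """All 2**n combinations of coded levels +1/-1, first factor changing fastest."""
    return [tuple(reversed(levels)) for levels in product((+1, -1), repeat=n)]

points = two_level_factorial(3)
for point in points:
    print(point)
```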
The estimation of the b's from this type of design is
straightforward. In fact, if $\sigma^2$ is the experimental error variance, we have
$b_i = \sum x_i y / \sum x_i^2$, and $V(b_i) = \sigma^2 / \sum x_i^2$ (the variance of $b_i$).

One thus has the essential ingredients needed to complete the 
first phase of the investigation. 
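As a hedged illustration of the first phase, the sketch below estimates the slopes from a two-level factorial design by the ratio of sums given above and then normalizes them to obtain a direction of steepest ascent. The eight responses are invented for the example (they were generated from an exact plane so that the answer is easy to check), and the function names are hypothetical.

```python
import math

# Coded 2^3 design in the usual +1/-1 levels.
design = [
    (+1, +1, +1), (-1, +1, +1), (+1, -1, +1), (-1, -1, +1),
    (+1, +1, -1), (-1, +1, -1), (+1, -1, -1), (-1, -1, -1),
]
# Hypothetical responses, one per design point (not data from the text).
responses = [11.5, 7.5, 13.5, 9.5, 10.5, 6.5, 12.5, 8.5]

def slopes(design, y):
    """b_i = sum(x_i * y) / sum(x_i ** 2) for each factor i."""
    n = len(design[0])
    return [
        sum(p[i] * v for p, v in zip(design, y)) / sum(p[i] ** 2 for p in design)
        for i in range(n)
    ]

def steepest_ascent_direction(b):
    """Unit vector proportional to the estimated slopes."""
    norm = math.sqrt(sum(v * v for v in b))
    return [v / norm for v in b]

b = slopes(design, responses)
direction = steepest_ascent_direction(b)
```

For these invented responses the slopes are 2, -1 and 0.5, so each step up the path raises the first factor, lowers the second, and raises the third in the proportions 2 : -1 : 0.5 (in coded units).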

In considering the second phase, we assume that we have
identified a point P that is in the neighborhood of the optimum point.
The experimental designs used at this stage of the problem are
known as composite designs. There are two types of composite
designs, central and non-central. The central composite design
takes the $2^n$ factorial design and adds points with
high and low levels for each variable as well as additional points
at the center of the design.

The central composite design for n = 2 is shown in Figure 1.
The $2^2$ factorial points are given as solid points while the added
points are open circles.

Figure 1. A Two-Dimensional Central Composite Design 



For the purpose of estimating the parameters of the quadratic
form, the central composite design can be shown to be more
efficient than the $3^n$ factorial design. As one might expect, this means
that a saving in experimental points can be realized, since interest
has been narrowed to estimating the optimum response point
rather than to studying generally the nature of the mathematical
model that explains the process under study.
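The saving in experimental points can be seen by enumerating a central composite design. The sketch below is illustrative only: the text does not give the distance of the added high and low points from the center, so the parameter `alpha` is an assumption, as are the function name and the default of a single center point.

```python
from itertools import product

def central_composite(n, alpha, n_center=1):
    """Factorial points, two axial points per factor at +/-alpha, and center points."""
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=n)]
    axial = []
    for i in range(n):
        for a in (-alpha, alpha):
            point = [0.0] * n
            point[i] = a
            axial.append(tuple(point))
    center = [tuple([0.0] * n)] * n_center
    return factorial + axial + center

points = central_composite(2, alpha=1.414)      # alpha is an assumed value
for n in (2, 3, 4):
    print(n, len(central_composite(n, alpha=1.0)), 3 ** n)
```

With one center point the design has $2^n + 2n + 1$ points, so for n = 3 and n = 4 it needs 15 and 25 runs against 27 and 81 for the three-level factorial.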

The location of an optimum point usually requires a series of
coordinated experiments, especially when one must first find the
neighborhood of the optimum. If the process being studied has
little or no time effect, so that one can combine results that are
obtained at different time intervals, the series of experiments can
often be developed into an organized sequential program. The
non-central composite designs are useful if one uses such a
sequential approach to his experimentation. The factorial portion
and the central point are run first and, if the optimum is found to
be close to the center being used in the factorial design, the
additional points required for the central composite design are then
used. If, however, the optimum response is nearer one of the other
points, the factorial portion is augmented to form a non-central
composite design. Of course, if it is indicated that a new location
should be sought through the use of the path of steepest ascent,
then the sequence is as follows. The fitting of the quadratic
surface, $Y = \sum_{i=0}^{n} \sum_{j \ge i}^{n} c_{ij} x_i x_j + e$, to the observations realized
from the composite design can be obtained by standard multiple
regression techniques. Following the estimation of the coefficients,
one can perform an analysis of variance on the results to establish
the significance of the several coefficients as well as the
significance of the regression itself. If one has some prior information as
to the value of $\sigma^2$, the experimental error variance, this information can be
used in a comparison with the residual mean square associated
with the regression analysis to provide a test of goodness of fit of
the second-degree equation. If the fit is not satisfactory, one may
change his neighborhood if this seems required, or increase the
order of the regression equation.

When such a test has indicated that an adequate fit has been 
obtained, the fact that an individual coefficient is or is not sta¬ 
tistically significant is of no practical significance. What this means 
is that one might just as well retain the small coefficient in his 
future analyses, since there appears to be no really good reason 
for making the hypothesis that one of the coefficients is actually 
zero in the population model. 

When the second-degree equation has been fitted, it is necessary
to interpret it to see if one can, in fact, determine the coordinates
of the optimum response point. Since the coefficients in a general
quadratic do not readily convey to the observer the nature of the
surface being represented, one usually resorts to a canonical
reduction of the equation so as to obtain the canonical form

$$Y = b_0 + b_{11} X_1^2 + b_{22} X_2^2 + \cdots + b_{nn} X_n^2.$$
There are many types of surfaces that can be obtained through
the use of the quadratic function. Under certain conditions,
including those where all the b's are negative, there will be a point
maximum in all the variables. Another situation, however, that may be
encountered is where the maximum is in fact remote from the
region of the design, but the surface is elongated along an axis
which passes close to the design. This indicates that the previous
experimentation has brought the experimenter not to a maximum
but close to a rising ridge of the surface. No conclusion as to
optimum conditions can be drawn in this latter case, but one can,
from observation of the nature of the rising ridge, determine where
additional experimentation should be carried out in attempting to
locate the optimum point. In the case that the optimum point falls
within the region of the experiment, its position can be obtained
by differentiating the original quadratic with respect to the
variables $x_1, x_2, \ldots, x_n$ in turn and equating the results to zero. This will
yield a set of linear equations which, when solved simultaneously,
give the coordinates of the optimum point. It should be
emphasized, however, that the nature of the surface should be critically
examined through the use of the canonical transformations
approach before one seeks these coordinates. In fact, as the
dimension of the problem increases, making a careful examination
becomes most important.
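For two variables both steps can be written out directly. The sketch below assumes a fitted surface of the form Y = c0 + c1 x1 + c2 x2 + c11 x1^2 + c22 x2^2 + c12 x1 x2; this coefficient naming and the numerical values are hypothetical. It solves the two linear equations obtained by setting the partial derivatives to zero, and computes the two canonical coefficients as the eigenvalues of the symmetric matrix of second-order terms, which is one standard way of carrying out the canonical reduction.

```python
import math

def stationary_point(c1, c2, c11, c22, c12):
    """Solve 2*c11*x1 + c12*x2 = -c1 and c12*x1 + 2*c22*x2 = -c2 by Cramer's rule."""
    det = 4.0 * c11 * c22 - c12 * c12
    if abs(det) < 1e-12:
        raise ValueError("no unique stationary point (ridge-like surface)")
    x1 = (-2.0 * c22 * c1 + c12 * c2) / det
    x2 = (-2.0 * c11 * c2 + c12 * c1) / det
    return x1, x2

def canonical_coefficients(c11, c22, c12):
    """Eigenvalues of [[c11, c12/2], [c12/2, c22]], smallest first."""
    mean = (c11 + c22) / 2.0
    radius = math.sqrt(((c11 - c22) / 2.0) ** 2 + (c12 / 2.0) ** 2)
    return mean - radius, mean + radius

# Hypothetical fitted coefficients, not results from the text.
c1, c2, c11, c22, c12 = 4.0, 6.0, -2.0, -3.0, 1.0
x_opt = stationary_point(c1, c2, c11, c22, c12)
b11, b22 = canonical_coefficients(c11, c22, c12)
is_maximum = b11 < 0 and b22 < 0
```

Here both canonical coefficients are negative, so the stationary point is a point maximum; one coefficient near zero would instead indicate the ridge situation described above, in which the solved coordinates should not be trusted as an optimum.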

The mechanics of analyzing the data obtained from the se¬ 
quence of observations made in following the approach outlined 
above can be readily adapted to digital computer programs. In 
fact, many of the procedures make use of techniques for which 
standard computer programs are already generally available. Thus 
in the initial phase, where one is interested in following the path 
of steepest ascent using a linear fit to the experimental data, 
multiple regression computer programs are applicable. These pro¬ 
grams give not only the best estimates for the regression co¬ 
efficients but also their standard errors as well as the standard 
error of estimate for the response variable. Through the use of 
transformations, the significance of the quadratic terms in the 
surface can also be readily tested using the same computer pro¬ 
gram. This enables one to determine when to abandon the steep¬ 
est ascent phase of the investigation. 

In the calculation of the actual path of steepest ascent, the
successive differentiation of the fitted linear relationship yields
simultaneous linear equations whose solution can be obtained
from standard programs for solving systems of simultaneous
equations. In fact, even the determination of the possible steps up the
path through the computation of coordinates of the points on the
path can easily be programmed.

When one reaches the point of fitting the quadratic surface to 
the data obtained from a composite design, the determination of 
the coefficients of the surface, their standard errors and the stand¬ 
ard error of estimate is also a multiple regression program appli¬ 
cation. The quadratic terms are simply treated as new linear 
variables in this case. The determination of the optimum point is 
again the solution of a set of simultaneous linear equations. 
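Treating the quadratic terms as new linear variables amounts to expanding each experimental point into the products $x_i x_j$ with $x_0 = 1$, after which any multiple regression routine can be applied to the expanded rows. A minimal sketch of that expansion follows; the function name is illustrative and the regression step itself is left to whatever routine is available.

```python
def quadratic_row(x):
    """Expand (x_1, ..., x_n) into all products x_i * x_j with 0 <= i <= j and x_0 = 1."""
    full = [1.0] + [float(v) for v in x]
    return [full[i] * full[j] for i in range(len(full)) for j in range(i, len(full))]

# For two factors the columns are: 1, x1, x2, x1^2, x1*x2, x2^2.
row = quadratic_row((2.0, 3.0))
print(row)
```

Each expanded row has (n + 1)(n + 2) / 2 entries, which is the number of coefficients $c_{ij}$ to be estimated, so the composite design must supply at least that many distinct points.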

VII. Classification Techniques 

The theory of statistical classification deals with the problem of 
assigning one or more individuals to one of several possible groups 
or populations on the basis of a set of characteristics observed 
among them. Thus, the problem of classification can be considered 
as a special case or application of multi-variate decision theory. The 
nature of the observed characteristics may vary from problem to 
problem. In some cases they may be all of a measured type, while 
in another situation the variables may all be of the simple cate¬ 
gorical type of attributes in which each observation can take on 
but one of a finite number of distinct values or states. Siegel has 
noted that “measurements may, in general, be from four scales: 
the nominal, ordinal, interval, and ratio scales. In any given multi¬ 
variate classification problem, the measurements may be of a mix¬ 
ture involving some or all of these types of variables.” It should 
be expected that numerous approaches have been advanced as to 
how one should go about evolving a classification decision rule. 

It should be recognized that since the area of interest has been 
designated as “statistical” classification, this means that the deci¬ 
sion rule must be based upon observational data available from 
samples from the several populations rather than on known popu¬ 
lation characteristics. Thus we assume that we have a sample of 
individuals from each population and for each of these individuals 
we have available the same set of observations as are available 
for the individual requiring classification. 

Consider for illustration a well-known classification problem, 
that of a prospective student applying for admission by submitting 
credentials such as his high school records and in addition being 
given a battery of admission tests. These data become the
multivariate set of observations available on each applicant. The
problem is to classify, in advance, the applicant into the population to
which he belongs, where the alternatives are the population of 
those students who can successfully complete college training and 
the population of students who will not complete the college 
courses successfully. Available to the admissions office are the 
same data on former students, some known to have completed and 
the remaining known not to have completed college. 

Let us now look at the steps required to evolve a classification 
rule. Statistical classification rules, in general, depend either upon 
the concept of likelihood where one considers the ratios of the 
likelihoods that the observation to be classified came from the 
suspect populations, or they depend upon the value of some classi¬ 
fication statistic whose form is assumed and is evaluated for the 
individual requiring classification. The samples that are available 
from each population are used to estimate the likelihood ratios or 
the constants in the classification statistic, depending on which 
approach is being used. 

There are four major steps that must be accomplished if one is 
to evolve a classification rule, in brief: selection of the variables, 
selection of the classification technique, selection of the decision 
rule, and an analysis of effectiveness. These we now consider. 

The Selection of the Variables to be Used in Making the Classi¬ 
fication. 

Here one encounters problems such as whether or not to include 
in his observational vector variables of different types, how re¬ 
liably each available variable can be measured or determined, the 
discrimination power of the variable relative to the populations of 
interest, the inter-relationship of the variables, and the cost of 
making each variable determination. The decisions of selection de¬ 
pend in the main on personal judgments, since at present no good 
selection rule exists. 

The Technique to be Used in Making the Classification Estimate 
and the Use of Available Sample Data to Make the Estimates. 

One can identify several estimation techniques in the literature;
however, the best known technique for measured variables is to
use the Wald Statistic, which is simply a linear function of the
observations in the form

$$W(z) = \sum_{q=1}^{P} \sum_{p=1}^{P} \sigma^{pq} z_p \left(\mu_q^{(2)} - \mu_q^{(1)}\right)$$

where $\sigma^{pq}$ = general term in the inverse of the common
covariance matrix and $\mu_q^{(i)}$ = mean of the q-th variable, $x_q$, in
population $\pi_i$.

Selection of the Decision Rule to be Used in Making the Actual
Classification Decision for a Given Observation.

To discuss this step at this stage, it seems best to restrict our
consideration to the two-population classification problem. We
then have available for making the classification decision either a
likelihood ratio that is a numerical function of the observational
vector z, say, L(z), or we have a classification statistic defined
as a numerical function of z, say C(z). In either case a decision
rule is then simply the division of the L(z) or C(z)
one-dimensional interval into two regions such that for those z's that
yield an L(z) or C(z) that falls in region two, the individual will
be classified into population two. Thus we have reduced the
problem of classification to that of determining the region.

Determining the Operational Effectiveness of the Classification
Technique.

Basic to the measurement of the operational effectiveness of any
classification technique are the probabilities:

p(i|j) = the probability of misclassifying an individual who
belongs in population $\pi_j$ into population $\pi_i$.

From these probabilities one can evolve expected cost estimates as
well as other criteria of worth. To obtain estimates of these
probabilities one requires the conditional distribution function of the
likelihood ratios or the classification statistic used in the technique.
In some cases these distributions can be expressed either exactly
or approximately in mathematical form and then the
misclassification probability estimations simply require the evaluation of an
integral over the required region. When such a mathematical
representation is not available, an empirical approach can be used
involving the individual observations available in the samples to
produce an empirical estimation of the conditional distributions.
Here it seems best to discuss the details of this step around an
actual problem.

For our problem let us assume that the admission office requires
an admission policy such that the probability of a student's doing
unsuccessful work if admitted should be less than or equal to 0.10.

1) The Control of Error Approach.

What is needed is the distribution function of the statistic W(z)
since we would like to select $\lambda$ such that P[W(z) > $\lambda$ | z belongs
to $\pi_1$] = 0.10. We know that W(z) is asymptotically normally
distributed under the condition that z belongs to $\pi_1$ with the mean,

$$\bar{W}_1 = \sum_{j=1}^{P} \sum_{i=1}^{P} \sigma^{ij} \left(\mu_j^{(2)} - \mu_j^{(1)}\right) \mu_i^{(1)}$$

and variance,

$$V_w = \sum_{j=1}^{P} \sum_{i=1}^{P} \sigma^{ij} \left(\mu_j^{(2)} - \mu_j^{(1)}\right) \left(\mu_i^{(2)} - \mu_i^{(1)}\right)$$

For the sample data, we find upon substituting the appropriate
sample characteristics into the formula for the means and variance
that

$$\bar{W}_1 = 7.746 \quad \text{and} \quad V_w = 3.676$$

Thus we have to solve for $\lambda$ in the equation

$$P(2 \mid 1) = \frac{1}{\sqrt{2\pi}} \int_{(\lambda - \bar{W}_1)/\sqrt{V_w}}^{\infty} e^{-z^2/2} \, dz = 0.10.$$

From the table of areas under the normal curve we have

$$1.282 = \frac{\lambda - 7.746}{\sqrt{3.676}}$$

and

$$\lambda = 10.20$$

and our classification decision rule can be stated as:

“If W(z) = 0.0350 $z_1$ + 0.0448 $z_2$ + 0.12747 $z_3$ > 10.20, classify the
observation as belonging to $\pi_2$.”

(That is, admit the student to the curriculum.)
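The arithmetic of the control of error approach can be checked with Python's standard library, which supplies the inverse normal distribution in place of a printed table. The mean, variance, error level and the three coefficients are the values quoted in the text; the applicant's scores and the function names are hypothetical and serve only to show the rule being applied.

```python
from statistics import NormalDist

mean_w1 = 7.746      # mean of W(z) for the unsuccessful population (from the text)
var_w = 3.676        # common variance of W(z) (from the text)
error_level = 0.10   # allowed probability of admitting an unsuccessful student

# Upper 10 percent point of the standard normal curve (about 1.282).
z_upper = NormalDist().inv_cdf(1.0 - error_level)
threshold = mean_w1 + z_upper * var_w ** 0.5

COEFFICIENTS = (0.0350, 0.0448, 0.12747)   # coefficients of W(z) quoted in the text

def wald_score(z):
    return sum(c * v for c, v in zip(COEFFICIENTS, z))

def admit(z, cutoff=10.20):
    """Classify into the successful population when W(z) exceeds the cutoff."""
    return wald_score(z) > cutoff

applicant = (100.0, 90.0, 30.0)            # hypothetical scores, not data from the text
decision = admit(applicant)
```

The computed threshold agrees with the value 10.20 obtained from the table of areas to two decimal places.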

In a more general sense we can balance the two misclassification
probabilities by selecting the appropriate value of $\lambda$ so as to
meet any single constraint that might be imposed. For example,
one may wish to control the errors such that the two probabilities
are equal. It is evident that the solution of the resulting integral
equation may require a numerical technique of some sort.

2) The Cost Control Approach. 

Consider in our student admission example that we have
available the cost factors:

C(2|1) = the cost of misclassifying an individual into population
$\pi_2$ when he really belongs to $\pi_1$ (admitting a poor
student) = 10, and

C(1|2) = the cost of classifying an individual into population $\pi_1$
when he belongs to $\pi_2$ (failing to admit a good student)
= 20.

$q_1$ = the a priori probability of a candidate for admission being
from population $\pi_1$ = 0.25.

$q_2$ = the a priori probability of a candidate for admission being
from population $\pi_2$ = 0.75.

Then if we wish an admission policy that would operate so as to
minimize the expected loss, we have that

$$L_\lambda = q_1 \, p(2 \mid 1, \lambda) \, c(2 \mid 1) + q_2 \, p(1 \mid 2, \lambda) \, c(1 \mid 2)$$

where $L_\lambda$ is the expected loss. In our particular case,

$$L_\lambda = (0.25)(10) \, p(2 \mid 1, \lambda) + (0.75)(20) \, p(1 \mid 2, \lambda) = 2.5 \, p(2 \mid 1, \lambda) + 15.0 \, p(1 \mid 2, \lambda).$$

So we seek a $\lambda$ which would minimize $L_\lambda$. One can simply try
different values of $\lambda$, determine the $p(2 \mid 1, \lambda)$ and $p(1 \mid 2, \lambda)$
corresponding to the $\lambda$ and then compute the $L_\lambda$. Since the
relationship between $L_\lambda$ and $\lambda$ is quite smooth, one can through such a
trial procedure approximate the appropriate minimizing value of
$\lambda$ within three or four steps.
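The trial procedure is easily mechanized. The sketch below evaluates the expected loss over a grid of cutoff values, assuming (as in the control of error approach) that W(z) is normally distributed in each population with the common variance 3.676; the means 7.746 and 11.422 are the sample values quoted in this section, while the grid limits, the step of 0.01 and the function name are arbitrary choices made for illustration.

```python
from statistics import NormalDist

SD = 3.676 ** 0.5
unsuccessful = NormalDist(7.746, SD)    # distribution of W(z) in the first population
successful = NormalDist(11.422, SD)     # distribution of W(z) in the second population

def expected_loss(cutoff, q1=0.25, q2=0.75, c21=10.0, c12=20.0):
    p21 = 1.0 - unsuccessful.cdf(cutoff)   # admit a student who will not succeed
    p12 = successful.cdf(cutoff)           # reject a student who would succeed
    return q1 * c21 * p21 + q2 * c12 * p12

candidates = [4.0 + 0.01 * k for k in range(1001)]   # cutoffs from 4.00 to 14.00
best_cutoff = min(candidates, key=expected_loss)
best_loss = expected_loss(best_cutoff)
```

Under these assumptions the minimizing cutoff falls near 7.8, well below the 10.20 of the control of error approach, because failing to admit a good student is both the more costly and the more probable error a priori.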

Determining the Effectiveness of the Above Classification Rule.
In the case of the above two-population, control of
misclassification error situation, we compute the probabilities:

P(2|1) = P(Admitting a student who subsequently does
unsatisfactory work)
= P(Classifying z into $\pi_2$ when z belongs to $\pi_1$),

and

P(1|2) = P(Failing to admit a student who could do successful
work)
= P(Classifying z into $\pi_1$ when z belongs to $\pi_2$).

Under step 3 we determined the classification rule (i.e., the $\lambda$)
such that p(2|1) = 0.10. To determine p(1|2) we have

$$\bar{W}_2 = \sum_{j=1}^{P} \sum_{i=1}^{P} \sigma^{ij} \left(\mu_j^{(2)} - \mu_j^{(1)}\right) \mu_i^{(2)} = 11.422,$$

and, due to the equal covariance assumption,

$$V_w = 3.676,$$

so,

$$P(1 \mid 2) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(10.20 - 11.422)/\sqrt{3.676}} e^{-z^2/2} \, dz = 0.26.$$

The rationale in these probability evaluations can best be ex¬ 
hibited graphically (Figure 1). 

Figure 1. Probability Evaluations 



Thus we find that the operational effectiveness of the
classification rule is such that P(2|1) = 0.10 and P(1|2) = 0.26. If one is
disturbed over the size of P(1|2), he can either increase the
allowable size of P(2|1) or he may seek additional or new variables
that better discriminate between the two populations.
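Both error probabilities of the rule can be reproduced from the normal approximation with the standard library. The means, variance and cutoff are those quoted in the text; the variable names are illustrative.

```python
from statistics import NormalDist

SD = 3.676 ** 0.5
cutoff = 10.20

# P(2|1): W(z) exceeds the cutoff although the student belongs to the first population.
p_admit_unsuccessful = 1.0 - NormalDist(7.746, SD).cdf(cutoff)

# P(1|2): W(z) falls at or below the cutoff although the student belongs to the second.
p_reject_successful = NormalDist(11.422, SD).cdf(cutoff)

print(round(p_admit_unsuccessful, 2), round(p_reject_successful, 2))
```

Trying a lower cutoff in the same two lines shows the trade-off described above: P(1|2) falls only at the price of a larger P(2|1).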

Essentially, each of the classification techniques identified above 
follows the four main development steps that were enumerated in 
detail for the Wald Classification Statistic. Two additional prob¬ 
lems warrant special mention, however. 

The first is the so-called distribution problem. That is, the
requirement to have some knowledge as to how the statistic or
likelihood ratio being used is distributed in probability under the
condition that an individual comes from $\pi_i$. This knowledge is
required if one wants to formulate the particular classification rule
to meet an error control or cost criterion. It is also needed if one
is to estimate measures of operational effectiveness. We used the
information that W(z) was normally distributed to generate
these distribution requirements in the student admission
illustrative example. One may, however, be interested in using a
classification technique for which the mathematical form of its
conditional probability distribution is unknown. In that case, especially
if one has available a high speed digital computer and the sample
sizes are sufficiently large, one can resort to the use of an
empirically generated conditional distribution using the sample data. To
illustrate the concept, let us suppose that we have available in
the student admission problem data on 190 individuals known to
belong to $\pi_1$ (unsuccessful). Then if the value of the statistic W(z)
is computed for the 190 cases, these observations could be
tabulated into a cumulated frequency distribution [...] plotted [...]
a smooth [...] to approximate the ogive of the underlying
conditional [...]

[...] for the college admission problem we would have the
frequency distribution and graphical representation as shown in
Table 1 and Figure 2.

Table 1. Frequency distribution for W(z), college entrance
problem, population $\pi_1$

Interval          Tally     Cum.
 4.50 -  5.24     [...]     [...]
 5.25 -  5.99     [...]     [...]
 6.00 -  6.74     [...]     [...]
 6.75 -  7.49     28        [...]
 7.50 -  8.24     21        [...]
 8.25 -  8.99     17        [...]
 9.00 -  9.74     24        [...]
 9.75 - 10.49     12        [...]
10.50 - 11.24     [...]     [...]
11.25 - 11.99     2         [...]
12.00 - 12.74     [...]     [...]


Figure 2. Empirical distribution of W(z) given $\pi_1$ (horizontal axis: W(z))
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A comparable empirical estimate of the distribution of W(z) 
under the condition that the observation belongs to population . 
r* be evolved through the use of the observation available ; m 
the sample from it.. The only variation m the technique wou 
b^in^he ^accumulation of tl/frequencie, In this second case one 
wrmld accumulate the frequencies with increasing W s. 
W °^us”uld have animate of P(2 11, A) wMch yidds die 
estimate of the probabihty of classifying an individualwho a 
as a tr, if one used the decision rule If W(z) > X classify the 

individual into 7T 2 . , .•<, i-Vtp 
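The empirical estimate described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `empirical_exceedance` and the ten sample values are hypothetical, not data from the admission study, and the estimate is the raw sample proportion rather than a smoothed ogive.

```python
from bisect import bisect_right


def empirical_exceedance(w_values, lam):
    """Estimate P(W > lam) as the share of sample values strictly above lam."""
    ordered = sorted(w_values)
    n_at_or_below = bisect_right(ordered, lam)  # count of values <= lam
    return (len(ordered) - n_at_or_below) / len(ordered)


# Hypothetical W(z) values for ten individuals known to belong to pi_1.
sample_pi1 = [5.1, 6.3, 6.8, 7.0, 7.7, 8.4, 9.1, 9.6, 10.2, 11.0]

# Estimated probability of classifying a pi_1 individual into pi_2
# under the rule "if W(z) > lam, classify into pi_2", with lam = 9.0.
p_2_given_1 = empirical_exceedance(sample_pi1, 9.0)
```

With a sample from the second population, the same sorted values would be counted from the bottom instead, giving the share at or below the cutoff.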

The second problem that warrants additional mention is the
multi-population problem. Here we are interested in classification
procedures that could classify an individual into one of several
populations, where the number of populations is greater than two.
If one can associate with each population π_i an a priori probability q_i, the
probability of obtaining for classification an observation from
population π_i, and a cost factor C(j|i), associated with
misclassifying an observation from π_i as being from π_j, a
rule is available that will minimize the expected cost of
making classification. The rule states that:

“If

$$\sum_{i=1,\ i \neq k}^{p} q_i \, p_i(z) \, C(k|i) \;\le\; \sum_{i=1,\ i \neq j}^{p} q_i \, p_i(z) \, C(j|i)$$

for all j (j ≠ k), then z should be classified into π_k.”

If the equality holds for some indices along with k then it is
immaterial as to whether the individual is classified into π_k or one
of the populations whose index yields the equality.
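A minimal sketch of the minimum expected cost rule follows. The function name, the argument layout, and the numerical values are assumptions made for illustration; `densities[i]` stands for the value of the density p_i(z) at the observed z, which would have to be supplied or estimated separately.

```python
def classify_min_expected_cost(q, densities, cost):
    """Return the index k that minimizes sum over i != k of q[i] * p_i(z) * C(k|i).

    q[i]         a priori probability of population i
    densities[i] value of the density p_i(z) at the observation z
    cost[k][i]   C(k|i): cost of assigning to k an observation from i
    """
    p = len(q)

    def expected_cost(k):
        return sum(q[i] * densities[i] * cost[k][i] for i in range(p) if i != k)

    # min() keeps the first index on ties, which the rule treats as immaterial.
    return min(range(p), key=expected_cost)


# Hypothetical three-population example with unit misclassification costs.
q = [0.5, 0.3, 0.2]
densities = [0.1, 0.4, 0.2]
unit_cost = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
best = classify_min_expected_cost(q, densities, unit_cost)
```

With unit costs the rule reduces to choosing the population with the largest q_i p_i(z); unequal costs can move the decision, as the second assertion in a test would show when C(1|0) is made large.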

It should be noted that the practical use of these classification
techniques will usually require the use of high speed computing
facilities. This is especially true if the dimension of the problem
is at all large or if one must empirically generate the conditional
distribution of the statistic being used by utilizing the individuals
available in the samples. There are
problems associated with the use of many of the techniques, but
it is felt that the systematic exploration of their applicability in
many practical problems cannot help but advance the general
state of the art. Although the discriminating power of the set of
variables currently being accumulated can be determined, the
questions of the underlying distributions and the relative
effectiveness of the competitive procedures must in many respects
be tackled pragmatically. Attention must be given to the problem
of estimating both the underlying a priori probabilities associated
with the populations being considered and the misclassification
cost factors. Individuals may feel that such refinements are
inappropriate to their particular classification problem, but it can
be argued that until one addresses himself to the problem in some
such systematic and scientific way, no real improvement can be
expected. The criterion of worth of any system is its operational
effectiveness, and thus one should not only feel challenged to
obtain estimates of the operational effectiveness of the “system”
he is now using, but he should also investigate how the effectiveness
might be improved by using one of the above procedures.


3. Applications to Practical Politics

Introductory Note 

The good Christian should beware of mathematicians . . . 

—St. Augustine 

O brave new world, that has such people in’t! 

—Shakespeare 

Andrew Hacker has suggested that the use of electromagnetic 
computers to simulate the political behavior of the real world has 
led to essentially trivial findings. A major fault, he notes, is that 
these enterprises are committee operations. “Computers . . . have
no judgment. The sad thing is that those who are running the
machines are themselves reluctant to exercise that quality which
the computer lacks. One reason is that most such projects are
team operations.” 1 Hacker’s example of this kind of futile activity
is the attempt to simulate voter reaction in the 1960 presidential 
primary in Wisconsin. 

By coincidence the 1960 Wisconsin primary model is the subject 
of the first part of the article by Frank Scalora in this volume. This 
model has been criticized also on the ground that the procedure 
permits a sinister manipulation of voter information, and leads to 
thought control and to “brainwashing.” This view has been ex¬
pressed by some prominent American political leaders, and it is
consistent with the point of view of those traditionalist political 
philosophers who emphasize ethical and normative approaches. 
Evaluation of the Wisconsin model based on this frame of refer¬ 
ence is obviously in sharp contrast with that of Hacker, for it 
suggests that the model is important and deserves careful and 
critical attention. 

The Scalora paper deals with problems in politics and adver¬ 
tising. The model is designed to aid a candidate by furnishing 
insights which can be used to persuade voters to support him, 
and it is designed to aid a firm in persuading university gradu¬ 
ates to go to work for it. The important question is whether the 
model achieves an efficient solution to these problems. If its value 
is negligible, it will not be because it is the product of a com¬ 
mittee and utilizes a computer. 

1 Andrew Hacker, “Mathematics and Political Science,” in James C. Charlesworth (ed.), Mathematics
and the Social Sciences (Philadelphia: American Academy of Political and Social
Sciences, 1963), pp. 65-69.

The Scalora model offers enhanced efficiency to those who want
to manipulate human behavior. Its successful employment is a 
legitimate subject of concern in normative terms, as well as proof 
that Hacker was in error to dismiss it lightly. The history of sci¬ 
ence is full of evidence that solutions to old problems often 
create new problems. This is an inevitable consequence of change. 
The progress of science and social change point toward com¬ 
plexity, not toward doomsday or the heavenly city. 

The article by Gerald Kramer is, like that of Scalora, designed 
to solve a problem of so-called practical politics. The aim of the 
model is the optimum allocation of scarce resources in a pre¬ 
election canvass. Output is measured in terms of votes won, or of 
votes gained. The political problem is, therefore, closely analogous 
to the resource allocation problems common to economics. Yet the 
fact that it is political is a source of specific difficulties. Votes 
won, or votes gained, are a difficult commodity to measure. The 
author, qualified in both mathematics and political science, has 
produced an imaginative model. If he solves the canvassing prob¬ 
lem, other resource allocation activities, such as television adver¬ 
tising, may become amenable to model solutions that are not pres¬ 
ently available. 

These models may be of little interest to the student of systems 
analysis or grand theory. Scalora and Kramer are concerned with 
relatively modest problems compared with war and peace, sur¬ 
vival in the nuclear age, the pursuit of justice, or the proper con¬ 
struction of nation-states. In a sense the aims of these papers are 
prediction and control, rather than understanding and extension 
of basic knowledge, and consequently the models may be less 
useful to political scholars than to the practitioners of the art of 
winning elections. On the other hand, as Havelock Ellis has ob¬ 
served: “In philosophy, it is not the attainment of the goal that 
matters, it is the things that are met with by the way.” If one 
appraises the work of political scientists in the light of the widely 
differing perceptions of what ought to be done, 2 it is evident that 
there is work aplenty for all. Some may devote themselves to great 
tasks, while others perform tasks which are immediately useful. 


2 See, for example, Charles S. Hyneman, The Study of Politics: The Present State 
of American Political Science (Urbana: University of Illinois Press, 1959) and Albert 
Somit and Joseph Tanenhaus, American Political Science: A Profile of a Discipline (New 
York: Atherton Press, 1964). 




Stochastic Models in the Behavioral
Sciences: Applications to Elections
and Advertising

FRANK S. SCALORA 

IBM-World Trade Corporation 

The use of mathematics in the behavioral sciences has benefited
from the availability of modern computing machines. It is
now possible to take a behavioral situation, put it into a mathematical
framework which simulates it, and then try it out on the
machine. If good data are available the validity of the mathematical
model can be tested and then simulation runs made, increasing
our understanding of the behavioral situation. It would be unrealistic
to attempt to do this by hand because of the difficulties
both of handling of data and of time availability.

We shall discuss in this paper a common type of behavioral
situation and the mathematical models developed to describe two
applications. The situation is basically that of an election, although
it can be interpreted in many ways.

In a political campaign, each candidate must decide which
issues to stress to help him win over his opponent. The advertising
campaign for a product consists of telling the consumer about its
especially good qualities to the detriment of competing products.
The recruiting manager of a company has to find out what are
the most effective aspects to stress in a campaign to get the best
people for the job.

All of these situations involve a population and competitors who 
are trying to influence the population by communicating with 
them. We will now describe the way these situations can be put 
into a useful mathematical framework. The problem would have 
remained intractable but for the availability of computing ma¬ 
chines to store and process huge quantities of data at high speeds 
and the uniting of sociological and mathematical ideas. 

The election model discussed here was developed by a research
group at IBM, of which the author was a member together with
Dr. William N. McPhee. Dr. McPhee has reported on the results
obtained when the model was used in the 1960 Wisconsin Primary
in McPhee: Formal Theories of Mass Behavior (London:
Free Press of Glencoe, 1963). The college recruitment model was
developed by the author at SBC with the consulting help of Dr.
McPhee. Earlier reports on the work have appeared in the Proceedings,
Eighth Annual Convention, New York, Advertising Research
Foundation, Fall, 1962 and in Robert D. Buzzell: Mathematical
Models and Marketing Management (Boston: Division of
Research, Harvard Business School, 1964).

In simulating a situation of the kind discussed here, we must 
isolate the main forces which are at work and describe them 
mathematically. Thus, we have a population which is being asked 
to make a choice among various competitors as the competitors 
address messages or communications to them. We think of the 
individual member of the population as a person with precon¬ 
ceived ideas about the competitors which are, however, under¬ 
going changes because of external forces generated by the cam¬ 
paigns of the competitors and environmental forces caused by the 
people with whom he associates. The model measures the change 
in a person’s initial evaluations of the competitors in the course 
of the campaign. We will later discuss the results obtained when
the model was applied to the simulation of a recruiting campaign.

The population is represented in the computer by a replica, a 
representative sample of it. On the basis of answers to a question¬ 
naire, large quantities of information about the sample are stored 
in the machine. To keep the setting completely general, we will 
continue to use the terms population and competitors. The reader 
can easily translate these terms to electorate and candidates, in 
the case of an election campaign; consumers and brands of pro¬ 
ducts, in the case of a consumer product advertising campaign; 
students and companies, in the case of company recruiting cam¬ 
paigns involving college students. The reader should have little 
difficulty in thinking of other situations close to his own experi¬ 
ence which are describable in terms of the model. 

The model formalizes the above observations. It consists of 
three processes or phases which abstract the preceding remarks. 
We begin with a representative sample of the population. The 
sample is stratified into homogeneous sub-samples dependent upon 
the application in question. For example, an election grouping 
might be a socio-economic or religious-ethnic one. A consumer 
grouping might be socio-economic or some other grouping which
delineates among buying patterns. A college recruiting grouping
might be on lines of major field and career interest, etc.

It is supposed that at the beginning each person in the sample
has given an overall rating of each of the competitors, say on a
scale of one-to-ten, although the scale may change from application
to application. These ratings may be obtained directly from
responses to a questionnaire as was done in the recruiting study
previously mentioned, or indirectly through a Lazarsfeld latent
structure analysis as was done in the election model, or possibly
in other ways. Each competitor will then try to increase the person’s
evaluation of itself relative to its competitors. In real life
the competitors will try to accomplish this by advertising or stressing
themes which they feel are particularly favorable to them.
The model simulates this by what we call a stimulation process.

In the stimulation process, we pick a theme or issue for each
competitor to stress. With Thurstone, we interpret a theme or
stimulus as a probability distribution. Thus, a candidate’s statement
on civil rights will affect Negroes somewhat differently from
White Liberals and much differently from White Southern Conservatives.
A company’s stressing of its pre-eminence in the computing
business will cause different reactions among students interested
in computers and students interested in pure mathematical
research. Thus, the effect of a stimulus will vary depending on
the group to which the person belongs. In fact, for each group we
have computed probability tables which relate overall ratings of
the competitors with ratings on the given theme or issue. Then
the response of a person to a stimulus is obtained through a random
process depending on the probability distribution peculiar to the
person’s group. We illustrate this by exhibiting a simplified “stimulus
table,” which came up in our recruiting study. This table represents
the theme (Issue 8), “Encouragement of Ingenuity,” for
Company A for a group of engineering students interested in
physics.


Response to Issue 8 for Company A

Overall Rating    Very High    High    Moderate    Low

Very High            .55        .45
High                 .33        .59      .08
Moderate             .20        .60      .20
Low                   -         .58      .42
Very Low              -          -      1.00


We obtained these numbers by the use of a mathematical
formula on the basis of answers to specific questions in the questionnaire.
There are, however, other ways of doing this. The model
interprets this table as follows: if a person in the group gives
Company A a very high overall rating, then we can expect him to
give a very high response to Company A’s use of Issue 8 (Encouragement
of Ingenuity) 55 per cent of the time and a high
response 45 per cent of the time. If he gives Company A a high
overall rating, then he can be expected to respond very highly to
the use of this issue 33 per cent of the time, highly 59 per cent
of the time, and moderately 8 per cent of the time, etc. Thus, we
allow the person to make a temporary response to the themes communicated
by the competitors by the use of these tables through
a Monte Carlo technique.
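The Monte Carlo draw described in this paragraph might look as follows in Python. The sketch uses only the two rows of the stimulus table that the text spells out (very high and high overall ratings); the names `STIMULUS_TABLE` and `draw_response` are hypothetical, and the standard library generator stands in for whatever random number source the original program used.

```python
import random

RESPONSES = ["very high", "high", "moderate", "low"]

# Rows quoted in the text for Issue 8, Company A: overall rating -> response probabilities.
STIMULUS_TABLE = {
    "very high": [0.55, 0.45, 0.00, 0.00],
    "high": [0.33, 0.59, 0.08, 0.00],
}


def draw_response(overall_rating, rng):
    """Draw one temporary response for a person with the given overall rating."""
    weights = STIMULUS_TABLE[overall_rating]
    return rng.choices(RESPONSES, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so that a run can be repeated
draws = [draw_response("high", rng) for _ in range(10_000)]
share_high = draws.count("high") / len(draws)
```

Over many draws the share of "high" responses should settle near the tabled .59, which is a convenient check that a table has been entered correctly.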

At this point, we assume that a person will want to check his
responses. Here we borrow an idea due to K. Lewin. In real life,
a person can check certain statements directly. For example, he
can check the fact that a glass can be broken with a hammer by
actually hitting a piece of glass with a hammer. In the situation
which we are discussing, however, he will not be able to check so
easily. For example, he cannot check objectively the statement that
one company is better to work for than another without actually
working for both, or that Governor Romney will make a good
President, etc. The next best approach is for him to compare his
feelings with those of a friend. The model handles this by what
we call “the discussion process.” We pick a friend for the person
from his group. If the friend confirms his impressions, then we
assume that the person’s temporary responses to the stimuli to
which he has been exposed are now permanent and are ready to
become part of his experience. If the friend does not confirm his
impressions, he will not necessarily accept his friend’s impressions
over his own, but will want to rethink the situation. The model
approximates this by exposing him to the same stimuli as before.
We now accept his emerging responses as permanent.

Finally, we assume that the person will incorporate his surviving
responses into his experience. In real life situations, the person is
making new evaluations based on past preferences and new
stimuli in a subjective fashion. The model accomplishes this by
what we call a “learning process.” The learning process consists
of a formula which computes new overall evaluations of the competitors
for the person involved. The formula “averages” the person’s
initial overall evaluations with the surviving responses
emerging from the stimulation and discussion processes, subject
to the quantity of knowledge that the person thinks he has about
the competitors. We have previously computed these knowledgeability
numbers for each person. One thing implicit in the formula
is that a person who thinks he knows a lot about the competitors
will be harder to change than one who thinks he knows less.

We have actually described one stage, or cycle, of the model.
We then store information about how the members of the sample
respond to the themes or issues which are likely to be used by the
competitors in their campaigns. At the end of a cycle the person
may be exposed to new stimuli by the competitors. Now, however,
he will go into the stimulation process with new overall
evaluations. A glance at the simplified stimulus table shown above
illustrates a point which we now make.

Subsequent communications by the competitors build on the
changes in a person’s attitude. Specifically, the learning process
causes slow changes in the overall ratings in the specific direction
toward the rating of the competitor on the specific issue being
stressed. This improvement, or deterioration, is a double process
because as his overall rating improves, say from high to very high,
then he is more likely to get a very high response to the stimulus
than he was before. Then, in turn, that will make him improve his
overall rating still further. Alternatively, as his overall rating decreases,
say from very high to high, then his distribution of probabilities
is worse. He will be less likely to get a very high response
than before, and may get some “moderate” responses, also. This
will make his overall rating deteriorate still further.

We illustrate the above remarks by the flow chart of the model,
which describes one cycle of the model.

[Flow chart of one cycle of the model. Surviving labels: Present Evaluation; Evaluation.]

The Election Model 

We now give the essential details of the election model. We will
not discuss any of its implications since these have been described
in W. N. McPhee: Formal Theories of Mass Behavior (London:
The Free Press of Glencoe, 1963). The references here are to the
Wisconsin presidential primary in 1960.

A. Input

1. The Voters

A sample of 1,783 Wisconsin voters was considered. The voters
can be grouped in several ways. We give a possible grouping
along religious-ethnic lines.

Group Number   Description                      Number of Voters

 1             German Catholic Urban               176
 2             German Catholic Rural               124
 3             German Lutheran Urban               169
 4             German Lutheran Rural               134
 5             German Unaffiliated Urban            76
 6             German Unaffiliated Rural            65
 7             Great Britain Protestant            122
 8             Great Britain Unaffiliated           61
 9             Scandinavian Urban                   77
10             Scandinavian Rural                  106
11             Polish Urban                         82
12             Polish Rural                         27
13             Irish Catholic                       58
14             Other Eastern European Urban         78
15             Other Eastern European Rural         54
16             Other Western European Urban         42
17             Other Western European Rural         34
18             Other                               298

               Total                             1,783


In addition the voters are broken down by congressional dis¬ 
tricts, of which there are ten in all. 

Each voter is given a complex of numbers at the beginning:
INT, P_H, P_K, P_R, SH, SK, SR, SN, C, Cm, G, FRD. Definitions of
these numbers follow:

INT  : A number designating the strength of the voter’s interest
       in his candidate of preference at the time in question
       (empty at beginning).

P_H  : The voter’s partisanship number (overall rating) for
       the candidacy of Senator Humphrey.

P_K  : The voter’s partisanship number (overall rating) for
       the candidacy of Senator Kennedy.

P_R  : The voter’s partisanship number (overall rating) for
       the candidacy of Vice President Nixon.

These three numbers were obtained originally through a process
called “latent structure analysis.” They are non-negative and are
bounded by 05 and 95 except at the beginning.

SH   : A non-negative number reflecting the voter’s cumulative
       involvement with the candidacy of Senator Humphrey.

SK   : A non-negative number reflecting the voter’s cumulative
       involvement with the candidacy of Senator Kennedy.

SR   : A non-negative number reflecting the voter’s cumulative
       involvement with the candidacy of Vice President Nixon.

SN   : A non-negative number reflecting the voter’s cumulative
       involvement with no party.

These four numbers are obtained from the partisanship numbers
and a number M_v for each voter which is determined according
to the answer to certain poll questions.

C    : The voter’s choice of candidate at the time in question
       (empty at beginning).

       C = 1. Choice for Humphrey
           2. Choice for Kennedy
           3. Choice for Nixon
           4. Non-Voting Choice

Cm   : The voter’s previous choice.

G    : The voter’s group number, e.g., 1 ≤ G ≤ 18 for the
       grouping which we have listed.

FRD  : The address in storage of another voter who is referred
       to as the voter’s friend with whom he presumably
       discusses the election.

2. The Stimulus Tables 

The voters will be subjected to certain stimuli which are pri¬ 
marily the key election issues. Given a candidate and a group of 
voters, a stimulus table is developed for each stimulus. The table 
ranks the stimulus according to intensity: 20 (very weak), 40 
(weak), 60 (neutral), 80 (strong), 100 (very strong), and proba¬ 
bility intervals going from 00 to 99. 

The stimulus table is merely a convenient way of assigning one 
of these intensity (stimulus) numbers to a voter with a certain 
probability. The probability that a particular number is chosen 
is determined by the length of its corresponding probability in¬ 
terval. To illustrate, consider the following table: 


Intensity    Corresponding Probability Interval

   20                    00-09
   40                    10-24
   60                    25-44
   80                    45-69
  100                    70-99

According to what was just said, a voter with the above
table will be assigned intensity number 20 with probability .10
(09 - 00 + 01); 40 with .15 (24 - 10 + 01); 60 with .20 (44 -
25 + 01); 80 with .25 (69 - 45 + 01); and 100 with .30 (99 -
70 + 01).
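The interval arithmetic above can be checked mechanically. The following Python sketch encodes the illustrative table as (low, high, intensity) triples; the names `INTERVAL_TABLE` and `intensity_for` are hypothetical, chosen here only for the illustration.

```python
# The illustrative stimulus table: (low, high, intensity), intervals inclusive.
INTERVAL_TABLE = [
    (0, 9, 20),
    (10, 24, 40),
    (25, 44, 60),
    (45, 69, 80),
    (70, 99, 100),
]


def intensity_for(random_number):
    """Return the intensity whose probability interval contains a number in 00-99."""
    for low, high, intensity in INTERVAL_TABLE:
        if low <= random_number <= high:
            return intensity
    raise ValueError("random number must lie between 00 and 99")


# Probability of each intensity = interval length / 100, e.g. (09 - 00 + 01) / 100.
probabilities = {
    intensity: (high - low + 1) / 100 for low, high, intensity in INTERVAL_TABLE
}
```

Because the intervals partition 00 to 99, the five probabilities necessarily sum to one.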


Two sets of stimulus tables were constructed, three person tables
and two person tables. The use of the three person stimulus tables
presupposes that each candidate wages a full campaign. Since this
was not the case in Wisconsin, where Nixon did not campaign, we
were led to construct two person stimulus tables for Humphrey and
Kennedy and special tables for Nixon. The special tables for
Nixon reflect one of three levels of campaigning: Normal Campaigning,
Moderate Campaigning, and No Campaigning.

B. Working of the Model 

The input is now subjected to three processes. 

1. The Stimulation Process

Given a stimulus for each candidate, we subject a given voter
to it in the following way: Three random numbers RN_H, RN_K,
RN_R (between 00 and 99) are chosen which determine the intervals
in the appropriate stimulus tables, and thus uniquely determine
the stimulus numbers STIM_H, STIM_K, STIM_R (= 20, 40, 60, 80,
100). We then have the computer add these to the partisanship
numbers P_H, P_K, P_R:

INT_H = P_H + STIM_H
INT_K = P_K + STIM_K
INT_R = P_R + STIM_R

If the two largest INTs are equal then we call the result a tie
and go through the same process again until there are no ties. At
this stage a largest INT emerges, called INT_max. There is a
threshold number INT_min with which INT_max is compared. If
INT_max < INT_min = 100 then C = 4, i.e., the voter is a non-voter
(the machine puts 4 into the C position and INT_max into the INT
position). Otherwise, C = 1, 2 or 3 according as INT_H, INT_K, or
INT_R is INT_max, and INT_max becomes the new INT.
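One pass of the stimulation process for a single voter can be sketched as below. Several points are assumptions rather than facts from the text: the threshold of 100 is read from a partly illegible passage, the function and variable names are hypothetical, and stimulus tables are represented as lists of (low, high, intensity) triples.

```python
import random

INT_MIN = 100   # assumed voting threshold; the source passage is partly illegible
NON_VOTER = 4   # choice code for a non-voter


def lookup(table, rn):
    """Return the intensity whose probability interval contains rn (00-99)."""
    for low, high, intensity in table:
        if low <= rn <= high:
            return intensity
    raise ValueError("random number must lie between 00 and 99")


def stimulate(partisanship, tables, rng):
    """Run one voter through the stimulation process and return (C, INT).

    partisanship: {1: P_H, 2: P_K, 3: P_R}
    tables:       {candidate code: [(low, high, intensity), ...]}
    """
    while True:
        interest = {
            c: p + lookup(tables[c], rng.randint(0, 99))
            for c, p in partisanship.items()
        }
        ranked = sorted(interest.values(), reverse=True)
        if len(ranked) < 2 or ranked[0] != ranked[1]:
            break  # no tie between the two largest INTs
    choice = max(interest, key=interest.get)
    int_max = interest[choice]
    if int_max < INT_MIN:
        return NON_VOTER, int_max
    return choice, int_max
```

Passing a restricted `partisanship` dictionary (codes 1 and 2 only) gives the two person variant used when a voter is sent back through the Humphrey and Kennedy tables alone.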

2. The Discussion Process 

Each voter has assigned to him a friend randomly. A given voter
enters the discussion process with interest and choice numbers
INT_V, C_V. Similarly, the friend has
numbers INT_F, C_F. If INT_V + INT_F ≥ MINM = 200 (minimum
joint interest) then we say that discussion takes place. Otherwise
no discussion takes place, and both voter and friend are unchanged
and are returned to storage, the voter voting according to INT_V
and C_V. When discussion takes place we compare C_V and C_F. If:

(1) C_V = C_F then the voter and his friend are unchanged.

(2) C_V = 1 or 2 and C_F = 3 or vice versa, then again the voter
and his friend are unchanged.

(3) C_V = 1, 2, or 3 and C_F = 4 or vice versa then the non-voter is
sent back through the stimulation process.

(4) C_V = 1 and C_F = 2 or vice versa then both voter and friend
are sent back through the stimulation process, but only
through Humphrey and Kennedy stimulus tables.

When every voter has been subjected to the discussion process
he has final choice and interest numbers, and final C’s are counted
up to give the vote. It is possible to take the discussion process out
of the model and proceed to the next stage, the learning process.
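The four discussion rules amount to a small decision function. The sketch below returns which of the two parties must be sent back through the stimulation process and whether the pass is restricted to the Humphrey and Kennedy tables; the function name and the return convention are assumptions made for the illustration.

```python
MINM = 200  # minimum joint interest for discussion to take place


def discussion_outcome(int_v, c_v, int_f, c_f):
    """Return (who is re-stimulated, restricted to Humphrey/Kennedy tables?).

    Choice codes: 1 Humphrey, 2 Kennedy, 3 Nixon, 4 non-voter.
    """
    if int_v + int_f < MINM:
        return set(), False            # no discussion takes place
    if c_v == c_f:
        return set(), False            # rule (1): friend confirms the choice
    if 4 in (c_v, c_f):
        # rule (3): only the non-voter is sent back
        return ({"voter"} if c_v == 4 else {"friend"}), False
    if {c_v, c_f} == {1, 2}:
        return {"voter", "friend"}, True   # rule (4)
    return set(), False                # rule (2): one of the pair chose Nixon
```

An empty set means both records are returned to storage unchanged.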

3. The Learning Process 

Every voter enters this stage with the numbers INT, C, SH, SK,
SR, SN, with the first two having come from the preceding stage.

If C = 4, 100 - INT is added to SN to get the new SN and the
other S’s remain unchanged.

If C = 1, 2 or 3 then INT - 100 is added respectively to SH, SK,
or SR to get the new SH, SK, or SR and the other S’s remain unchanged.
We then compute new partisanship numbers as follows:

$$P_H = \frac{SH}{SH + SK + SR + SN}$$

$$P_K = \frac{SK}{SH + SK + SR + SN}$$

$$P_R = \frac{SR}{SH + SK + SR + SN}$$

These numbers are then rounded off so that they stay in the
range between 05 and 95. Thus if P_H > 95 it is changed to 95 and
if 0 < P_H < 05 it is changed to 05. These three numbers then become
the new partisanship numbers for the voter and he is ready
to be subjected to a new set of stimuli.
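The learning step can be sketched as follows. Two assumptions are made and should be read as such: the shares are expressed as percentages so that the bounds 05 and 95 are meaningful, and they are rounded to whole numbers. The function name `learn` and the dictionary keys are hypothetical.

```python
def learn(int_number, choice, s):
    """Update the cumulative involvement numbers and recompute partisanship.

    s is a dict with keys 'H', 'K', 'R', 'N' (SH, SK, SR, SN).
    choice uses the codes 1 Humphrey, 2 Kennedy, 3 Nixon, 4 non-voter.
    Returns (new_s, new_p) where new_p has keys 'H', 'K', 'R'.
    """
    key = {1: "H", 2: "K", 3: "R", 4: "N"}[choice]
    new_s = dict(s)
    # A non-voter adds 100 - INT to SN; a voter adds INT - 100 to his candidate.
    new_s[key] += (100 - int_number) if choice == 4 else (int_number - 100)

    total = sum(new_s.values())
    new_p = {}
    for cand in ("H", "K", "R"):
        share = 100 * new_s[cand] / total   # assumption: share taken as a percentage
        if share > 95:
            share = 95
        elif 0 < share < 5:
            share = 5
        new_p[cand] = round(share)          # assumption: rounded to a whole number
    return new_s, new_p


# Hypothetical voter who chose Kennedy with interest number 140.
s_after, p_after = learn(140, 2, {"H": 100, "K": 200, "R": 50, "N": 50})
```

Because the increments accumulate in the S numbers, a voter with large totals moves only slightly after each cycle, which is how accumulated involvement dampens change.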

C. Output 

The basic output consists of the following: 

1. The names of the candidates followed by the stimuli that
they are using to stimulate the electorate at the point in question.

2. Election Intentions

This is a tabulation of the voting after the stimulation process.
It includes the names of the candidates together with their vote
totals and the no-vote totals, the corresponding percentages, and
the percentages based on the actual number of people who have
voted. This is followed by the voting subtotals by groups.
Example:

Election Intentions

             Total    Percent    Percent Voting

Humphrey       354       20           31
Kennedy        412       23           36
Nixon          392       22           34
No Vote        625       35

Total:       1,783

Subtotals by Groups

Group       1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18

Humphrey   13  10  36  35  24  18  35  15  15  29  12   2   2  12  10  11   6  69
Kennedy    69  58  20  20   9  15   6   9  18   9  32  11  36  28  13   7   5  47
Nixon      37  13  42  27  22   9  43  17  17  29   5   3   6  15   4   5   4  94
No Vote    57  43  71  52  21  23  38  20  27  39  33  11  14  23  27  19  19  88
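The figures in this example are internally consistent, which can be confirmed with a short Python check. The rows below are the group subtotals of the example read in group order 1 to 18; the variable names are hypothetical, and rounding to the nearest whole per cent is assumed.

```python
# Subtotals by group (groups 1-18) from the Election Intentions example.
SUBTOTALS = {
    "Humphrey": [13, 10, 36, 35, 24, 18, 35, 15, 15, 29, 12, 2, 2, 12, 10, 11, 6, 69],
    "Kennedy": [69, 58, 20, 20, 9, 15, 6, 9, 18, 9, 32, 11, 36, 28, 13, 7, 5, 47],
    "Nixon": [37, 13, 42, 27, 22, 9, 43, 17, 17, 29, 5, 3, 6, 15, 4, 5, 4, 94],
    "No Vote": [57, 43, 71, 52, 21, 23, 38, 20, 27, 39, 33, 11, 14, 23, 27, 19, 19, 88],
}

totals = {name: sum(row) for name, row in SUBTOTALS.items()}
sample_size = sum(totals.values())
voting = sample_size - totals["No Vote"]

# Percent of the whole sample, and percent of those actually voting.
percent = {name: round(100 * t / sample_size) for name, t in totals.items()}
percent_voting = {
    name: round(100 * t / voting) for name, t in totals.items() if name != "No Vote"
}
```

The row sums reproduce the totals 354, 412, 392 and 625, and the overall sum is the sample size of 1,783.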

3. Election Results 

This is a tabulation of the voting after the discussion process. Of
course, if the discussion process is removed, then this appears after 
the stimulation process and replaces the Election Intentions. 

4. Election Results by Congressional Districts 

This is a tabulation of the election results percentages by congressional
districts. An example follows:

Congressional    Percent     Percent    Percent
Districts        Humphrey    Kennedy    Nixon

     1              31          33        36
     2              29          37        34
     3              33          32        35
     4              28          39        33
     5              29          34        37
     6              29          37        34
     7              32          35        33
     8              29          42        29
     9              35          29        36
    10              33          35        32


5. Election Results by Groups 


This is a tabulation of the election vote percentages by groups.
Example:

                                       Percent     Percent    Percent
Group                                  Humphrey    Kennedy    Nixon

 1. German Catholic Urban                 11          58        31
 2. German Catholic Rural                 12          72        16
 3. German Lutheran Urban                 37          20        43
 4. German Lutheran Rural                 43          24        33
 5. German Unaffiliated Urban             44          16        40
 6. German Unaffiliated Rural             43          36        21
 7. Great Britain Protestant              42           7        51
 8. Great Britain Unaffiliated            37          22        41
 9. Scandinavian Urban                    30          36        34
10. Scandinavian Rural                    43          13        44
11. Polish Urban                          24          65        11
12. Polish Rural                          13          69        18
13. Irish Catholic                         5          82        13
14. Other Eastern European Urban          22          51        27
15. Other Eastern European Rural          37          48        15
16. Other Western European Urban          48          30        22
17. Other Western European Rural          40          33        27
18. Other                                 33          22        45

The College Recruiting Model

We now discuss the College Recruiting Model, some details 
of which have appeared in R. D. Buzzell: Mathematical Models 
and Marketing Management (Boston: Division of Research, Har¬ 
vard Business School, 1964). 

The problem we considered was one in which four companies 
were seeking the services of honor students majoring in engi¬ 
neering, mathematics, and physics. An earlier study had shown 
that in evaluating a company as a place to work, such students 
considered twelve issues to be particularly critical. These issues 
are: 

1. The company’s standing in your major field of career interest.

2. The caliber of its personnel. 

3. The opportunities it provides to do challenging work. 

4. The opportunities it provides for rapid advancement. 

5. The quality of its products or services. 

6. How hard the company drives to achieve its goals. 

7. Its special training program . . . formal courses offered, etc. 

8. The encouragement it gives individuals to use their own in¬ 
genuity in tackling problems. 

9. The amount of basic research the company undertakes. 

10. The extent to which the company is considerate of employees 
while striving for maximum profits. 

11. Starting salary. 

12. The amount of financial aid and other assistance it gives to 
help employees obtain advanced degrees. 

These twelve issues then became the stimuli to be used in the 
stimulation process in the model, the themes which the companies 
would be expected to use in their advertising. 

A sample of honor students majoring in engineering, mathematics,
and physics from five universities was obtained and polled
in January and again in May, 1961, at the close of the school year.*

In addition, separate samples drawn from the same population 
were quizzed at intervals within this period to determine which 
communications were getting across. The samples were then 
divided into mutually exclusive sub-samples according to career 
interest. 

* The selection and polling of the samples was done by Benton & Bowles.

A sample of about 250 students was considered. After rejection
of some inadequate questionnaires, the students were grouped in
the following way:

Group                                                          No. of
Number  Description                                            Students

  1     Engineering students interested in computers              39
  2     Engineering students interested in physics                49
  3     Engineering students interested in systems                49
  4     Engineering students interested in general
        engineering                                               23
  5     Mathematics, physics students interested in
        computers                                                 23
  6     Mathematics, physics students interested in
        mathematics                                               22
  7     Mathematics, physics students interested in
        physics                                                   29
                                                                 ---
        Total                                                    234


The students were asked to give an overall evaluation of each 
of the four companies as a place to work, and also an evaluation 
on each of the twelve issues. A scale of one to ten was used for 
each rating. On the basis of this information, we were able to 
compute the stimulus tables described below. 

Each student is represented by a vector of numbers. Here the
subscript indicates the time stage of the game, and the superscript
the company involved.

$I_n$; $I_n^{(j)}$, $1 \le j \le 4$; $P_n^{(j)}$, $1 \le j \le 4$; $K^{(j)}$, $1 \le j \le 4$; $\Sigma_n^{(j)}$, $1 \le j \le 4$; $C$; $G$; $F$.

At the beginning, $n = 0$ and $I_0$, $I_0^{(j)}$ are empty.

$P_0^{(j)}$: The initial partisanship number (overall rating) for
the jth company is the student’s own rating of the jth
company as a place to work. It is an integer between
1 and 10, and was obtained from a questionnaire.

$K^{(j)}$: An integer between 1 and 4 indicating how fixed the
student’s attitude toward the jth company is. A high
$K^{(j)}$ indicates the student will be more difficult to
influence than one with a low $K^{(j)}$.

$\Sigma_0^{(j)}$: Given by the formula $\Sigma_0^{(j)} = K^{(j)} P_0^{(j)}$.

$C$: The choice number. An integer from 1 to 4 indicating,
after each cycle or stage, the company the student
would be most likely to choose if he were forced to
make a choice then.

$G$: The group number. An integer from 1 to 7 indicating
to which group the student belongs.

$F$: The address of a friend in storage.

In addition, each student is labeled as to whether a given
stimulus is not so important, somewhat important, or very
important to him.

Given a company and a group of students, a stimulus table was
constructed for each stimulus. The stimulus table takes the
following form:

[Stimulus table: one row for each overall rating $i$ of the company as a place to work, one column for each rating $j$ of the company on the given stimulus, with entry $p_{ij}$. Horizontal axis: Rating of the Company on a Given Stimulus.]

In the table, $p_{ij}$ is the percentage of those students in the given
group, having rated the given company $i$ overall as a place to
work, who have also rated it $j$ on the given stimulus. Thus,

$\sum_j p_{ij} = 1$ for every $i$.

We interpret $p_{ij}$ as the probability that a student in the given
group will rate the company $j$ on the given stimulus if he
has rated it $i$ overall as a place to work. It is thus a conditional
probability.

B. The Mechanics of the Model

The input is now subjected to the first stage of the model, which
consists of the three phases or processes already discussed.

1. The Stimulation Process

Each of the four companies concerned decides on a stimulus.
Given a stimulus for each company, we subject a student to it in
the following way. Four random numbers (between 0 and 1) are
chosen, one for each company.

The stimulus and company in question and the group to which
the student belongs determine the stimulus table to be used. The
student’s partisanship number $P_0^{(j)}$ determines which row of the
stimulus table is applicable. Finally, the random number picks
out a square in that row and a corresponding intensity or strength
of stimulus $I_1^{(j)}$ (an integer from 1 to 10 obtained from the
horizontal axis of the stimulus table). We illustrate by the following
example. Let $P_0^{(1)} = 9$, and suppose the “9” row takes on the
following form:

| .2 | .4 | .2 | .1 | .1 | 0  | 0  | 0  | 0  | 0  |
  10   9    8    7    6    5    4    3    2    1

Then the random number is expected to fall in the:

“10” square 2/10’s of the time

“ 9” square 4/10’s of the time

“ 8” square 2/10’s of the time

“ 7” square 1/10 of the time

“ 6” square 1/10 of the time

and the other squares with probability 0.

Suppose that the random number falls in the “8” square; then

$I_1^{(1)} = 8$.

Let $I_1 = \max_{1 \le j \le 4} I_1^{(j)}$. In case there is no unique maximum, then
we stimulate the student again with the same stimuli. If again
there is no unique maximum, we use a “coin tossing” mechanism
to produce a unique maximum. Once we have such a maximum,
i.e., suppose $I_1 = I_1^{(j)}$, then we say that $C = j$. Thus we infer that
if the student had to make a company choice as of the stimuli of
the moment he would choose the jth company. We will ordinarily
stimulate a student only if the chosen stimulus is one that
he considers somewhat important or very important to him.
However, we may stimulate all the students if we prefer.

2. The Discussion Process

Each student has assigned to him a friend chosen randomly
within his group. The student has choice number $C(S)$ and the
friend choice number $C(F)$. The rules we set up for discussion
are as follows:

1. If $C(S) = C(F)$, then we say that the student and friend
agree; there is no change in the student’s numbers coming out
of the stimulation process, and he is ready to go into the third
process, the learning process.

2. If $C(S) \ne C(F)$, then we have disagreement, and the student
is asked to re-evaluate his position. He does this by being
restimulated with the same stimuli as before. Once restimulated,
we allow him to go into the learning process without further
discussion.

It is possible to take the discussion process out of the
model, and proceed instead directly from stimulation to learning.
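The stimulation and discussion rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the function names are hypothetical, each stimulus-table row is assumed to be a list of ten probabilities for intensities 10 down to 1, and the “coin tossing” step is assumed to pick one of the tied companies at random.

```python
import random

# Column labels of a stimulus-table row, read left to right (10 down to 1).
INTENSITIES = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]


def draw_intensity(row, u):
    """Map a random number u in [0, 1) to a stimulus intensity.

    `row` holds the conditional probabilities p_ij for intensities 10..1.
    """
    cumulative = 0.0
    for intensity, p in zip(INTENSITIES, row):
        cumulative += p
        if u < cumulative:
            return intensity
    # Floating-point round-off guard: fall back to the last nonzero square.
    positive = [i for i, p in zip(INTENSITIES, row) if p > 0]
    return positive[-1]


def unique_max(intensities):
    """Return the 1-based company index of a unique maximum, else None."""
    top = max(intensities)
    winners = [j for j, v in enumerate(intensities, start=1) if v == top]
    return winners[0] if len(winners) == 1 else None


def stimulate(rows, rng):
    """Stimulate once, repeat once on a tie, then break ties at random.

    `rows` is a list of four rows, one per company. Returns the interest
    numbers and the choice number C.
    """
    for _ in range(2):
        intensities = [draw_intensity(row, rng.random()) for row in rows]
        choice = unique_max(intensities)
        if choice is not None:
            return intensities, choice
    # Assumed form of the "coin tossing" mechanism.
    top = max(intensities)
    tied = [j for j, v in enumerate(intensities, start=1) if v == top]
    return intensities, rng.choice(tied)


def discuss(rows, intensities, choice, friend_choice, rng):
    """Keep the numbers on agreement; restimulate once on disagreement."""
    if choice == friend_choice:
        return intensities, choice
    return stimulate(rows, rng)


# The "9" row from the example in the text.
ROW_9 = [0.2, 0.4, 0.2, 0.1, 0.1, 0, 0, 0, 0, 0]
```

With `ROW_9`, a random number of 0.65 lands in the “8” square, since the cumulative probabilities of the first three squares are .2, .6, and .8.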
3. The Learning Process

At this point, each student has “interest” numbers $I_1$, $I_1^{(j)}$,
$1 \le j \le 4$, in addition to his initial “partisanship” numbers $P_0^{(j)}$,
cumulative numbers $\Sigma_0^{(j)}$ and “mass” numbers $K^{(j)}$. In the learning
process, we compute new partisanship numbers $P_1^{(j)}$ as follows.
With $\Sigma_1^{(j)} = \Sigma_0^{(j)} + I_1^{(j)}$, we define:

$P_1^{(j)} = \frac{\Sigma_1^{(j)}}{K^{(j)} + 1} = \frac{\Sigma_0^{(j)} + I_1^{(j)}}{K^{(j)} + 1} = \frac{K^{(j)} P_0^{(j)} + I_1^{(j)}}{K^{(j)} + 1}$.

In the case in which the student has not been stimulated by the
jth company, i.e., the case in which the stimulus chosen by the jth
company is not considered important by the student, we define
$P_1^{(j)} = P_0^{(j)}$, or equivalently, set $I_1^{(j)} = P_0^{(j)}$, thus:

$P_1^{(j)} = \frac{K^{(j)} P_0^{(j)} + P_0^{(j)}}{K^{(j)} + 1} = P_0^{(j)}$.

At this point, the student is ready to go through the second cycle
or stage of stimulation, discussion and learning. Since $P_1^{(j)}$ need
not be an integer, let $\bar{P}_1^{(j)}$ be the integer nearest $P_1^{(j)}$. Then the
row of the stimulus table to be used is determined
by $\bar{P}_1^{(j)}$, and so on. In the case in which the student is not
stimulated, we take his interest numbers from the partisanship number
of the previous stage, i.e., we let $I_n^{(j)} = P_{n-1}^{(j)}$. This is equivalent to
setting $P_n^{(j)} = P_{n-1}^{(j)}$, as can be verified easily. Then we proceed
as above.

After n stages, the partisanship number $P_n^{(j)}$ is given by the
formula:

$P_n^{(j)} = \frac{\Sigma_n^{(j)}}{K^{(j)} + n} = \frac{K^{(j)} P_0^{(j)} + \sum_{i=1}^{n} I_i^{(j)}}{K^{(j)} + n}$.
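The learning rule is a running average in which the initial rating counts as K earlier observations. The short Python sketch below illustrates this for one student and one company; the function name `partisanship_path` and the use of `None` to mark an unstimulated stage are assumptions made for the example, not part of the original model description.

```python
def partisanship_path(p0, k, interests):
    """Return [P_1, ..., P_n] for one student and one company.

    p0        -- initial partisanship number P_0 (1 to 10)
    k         -- "mass" number K (1 to 4)
    interests -- interest numbers I_1..I_n; None marks a stage in which
                 the student was not stimulated (then I_n = P_{n-1})
    """
    sigma = k * p0  # cumulative number Sigma_0 = K * P_0
    path = []
    for n, i_n in enumerate(interests, start=1):
        if i_n is None:
            # Not stimulated: reuse the previous partisanship number.
            i_n = sigma / (k + n - 1)
        sigma += i_n                # Sigma_n = Sigma_{n-1} + I_n
        path.append(sigma / (k + n))  # P_n = Sigma_n / (K + n)
    return path


# Illustrative values: P_0 = 9, K = 2, stimulated in stages 1 and 3 only.
example = partisanship_path(9, 2, [8, None, 10])
```

In this example the first stage gives (2·9 + 8)/3 = 26/3, the unstimulated second stage leaves the value unchanged, and the third stage moves it toward the new interest number of 10. A larger K makes each stage move the rating less, which matches the description of K as a measure of how fixed the student's attitude is.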


C. Output

For each student we will be able to read his partisanship number
after each stage, e.g.:

$P_1^{(1)}, P_1^{(2)}, P_1^{(3)}, P_1^{(4)}; \ldots; P_n^{(1)}, P_n^{(2)}, P_n^{(3)}, P_n^{(4)}$.

Next, we will get, after each stage, each company’s average
partisanship number:

$\bar{P}_n^{(j)} = \frac{1}{234} \sum_{m=1}^{234} P_{nm}^{(j)}$,

where $P_{nm}^{(j)}$ is the partisanship number of the mth student for the
jth company after the nth stage and where 234 is the sample size.

Finally, we will get the distribution for each company of the
$P_n^{(j)}$’s after each stage. For this purpose, we round off the $P_n^{(j)}$’s
to the nearest integer and express the number of them in each
box by a percentage of the total. We also express these kinds of
output by student group.


D. Description of the Experiment 

The experiment consisted of two parts. In the first part, we set 
out to simulate in the model the actual advertising campaigns 
which took place during the period in question. This turned out 
to be rather difficult, for various reasons. For one, it was cus¬ 
tomary to use several of the critical issues in one advertisement. 
All of the companies were found to be using most of the critical 
issues in their advertising. Another difficulty was that it was ex¬ 
tremely hard, if not impossible, to estimate company recruitment 
budgets, media mixes, and the exact correspondence between 
frequency of advertising messages and number of cycles through 
which the message should be used in the model. 

Because of this, we devised the idea of an “average issue” or
stimulus. The stimulus tables for the average issue were computed
by averaging out the corresponding entries in the stimulus tables
for all of the twelve issues. For example, the average stimulus
table for Company 1 and Group 1 was obtained by averaging out
the corresponding entries in the Company 1-Group 1 stimulus
tables for all of the twelve issues. We then used these average
stimulus tables through ten cycles of the model and compared the
results with the evaluations given in the second questionnaire. We
give the results by group and by totals.

                    Company I   Company II   Company III   Company IV

Group 1
Questionnaire I        8.54        7.08         8.67          7.23
Questionnaire II       8.72        7.49         8.41          7.31
Model Prediction       8.69        7.59         8.62          7.64

Group 2
Questionnaire I        8.57        7.14         7.10          8.04
Questionnaire II       8.94        7.41         7.51          7.55
Model Prediction       8.94        7.74         8.25          8.22

Group 3
Questionnaire I        8.43        7.78         7.86          7.82
Questionnaire II       8.53        7.73         7.92          7.41
Model Prediction       8.80        7.92         8.45          7.92

Group 4
Questionnaire I        7.96        7.57         7.78          7.74
Questionnaire II       8.83        7.83         8.30          7.48
Model Prediction       8.73        7.83         8.54          7.74

Group 5
Questionnaire I        8.43        7.52         8.39          7.26
Questionnaire II       8.70        7.65         8.52          7.39
Model Prediction       8.82        7.68         8.50          7.61

Group 6
Questionnaire I        8.14        7.23         7.73          7.18
Questionnaire II       8.59        7.27         8.18          7.18
Model Prediction       8.82        7.83         8.43          7.73

Group 7
Questionnaire I        8.76        7.45         7.14          6.69
Questionnaire II       9.00        7.52         7.28          7.21
Model Prediction       9.10        7.86         8.27          7.20

Total
Questionnaire I        8.44        7.39         7.78          7.50
Questionnaire II       8.76        7.56         7.96          7.38
Model Prediction       8.85        7.78         8.43          7.78

As can be easily verified, the model prediction falls well within
the most stringent $\chi^2$ goodness of fit criteria. We achieved even
better results by running each of the twelve issues ten times and
averaging the results, assuming that only students who thought
the issue important would make a response to it. The results in this
case follow.


                    Company I   Company II   Company III   Company IV

Group 1
Questionnaire I        8.54        7.08         8.67          7.23
Questionnaire II       8.72        7.49         8.41          7.31
Model Prediction       8.71        7.43         8.62          7.57

Group 2
Questionnaire I        8.57        7.14         7.10          8.04
Questionnaire II       8.94        7.41         7.51          7.55
Model Prediction       8.79        7.48         7.97          8.15

Group 3
Questionnaire I        8.43        7.78         7.86          7.82
Questionnaire II       8.53        7.73         7.92          7.41
Model Prediction       8.68        7.83         8.20          7.98

Group 4
Questionnaire I        7.96        7.57         7.78          7.74
Questionnaire II       8.83        7.83         8.30          7.48
Model Prediction       8.42        7.79         8.23          7.84

Group 5
Questionnaire I        8.43        7.52         8.39          7.26
Questionnaire II       8.70        7.65         8.52          7.39
Model Prediction       8.68        7.67         8.50          7.65

Group 6
Questionnaire I        8.14        7.23         7.73          7.18
Questionnaire II       8.59        7.27         8.18          7.18
Model Prediction       8.69        7.55         8.24          7.62

Group 7
Questionnaire I        8.76        7.45         7.14          6.69
Questionnaire II       9.00        7.52         7.28          7.21
Model Prediction       8.98        7.62         7.92          7.31

Total
Questionnaire I        8.44        7.39         7.78          7.50
Questionnaire II       8.76        7.56         7.96          7.38
Model Prediction       8.72        7.62         8.23          7.79

In the second part of the experiment, we took each issue and
ran it through the model ten times for each company. Thus, we 
started with Issue 1, “the company’s standing in your field of 
career interest,” and we let each of the companies use this issue 
through ten cycles of the model. Then we started anew with Issue 
2 and did the same thing. And so on, through Issue 12. 

We did this part of the experiment in two ways. In the first 
way, we let each of the students be stimulated by the given 
issues. In the second way, we stimulated a student with an issue 
if he felt the issue was important to him. We will summarize some 
of the results. 

First of all, we defined an issue to be a good one for a given
company if through the use of it the company’s average overall
rating went up more than 5 per cent. We defined the issue as
being a better one for a company than for another company if
its percentage rise was greater than that of the other company.
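The two definitions amount to a small classification rule, sketched here in Python. The function names and the sample ratings are hypothetical, chosen only to illustrate the rule; the percentage rise is assumed to be measured relative to the rating before the ten cycles.

```python
def percent_rise(before, after):
    """Percentage change in a company's average overall rating."""
    return 100.0 * (after - before) / before


def classify(rise_own, rise_other, threshold=5.0):
    """Place an issue in one of the four boxes used in the comparison.

    rise_own   -- percentage rise for the company under study
    rise_other -- percentage rise for the competitor
    threshold  -- a "good" issue raises the rating by more than this
    """
    quality = "good" if rise_own > threshold else "poor"
    relative = "better" if rise_own > rise_other else "not as good"
    return quality, relative


# Hypothetical ratings before and after ten cycles on one issue.
own = percent_rise(8.0, 8.6)    # about 7.5 per cent
other = percent_rise(8.5, 8.8)  # about 3.5 per cent
box = classify(own, other)
```

Here the issue falls in the “good, better” box, since the company's rise exceeds both the 5 per cent threshold and the competitor's rise.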

Thus, in comparing Company 3 with Company 1, its main com¬ 
petitor, we found the following breakdown useful. 



                Company 3 Better            Company 3 Not as
                Than Company 1              Good as Company 1

Good Issues     Quality of products         Challenging work          Company
                Standing in field of        Caliber of personnel      Issues
                  career interest           Basic research
                Drive to achieve goals

Poor Issues     Starting salary             Encouragement of          Individual
                Training program              ingenuity               Issues
                Consideration of            Aid to education
                  employees
                Rapid advancement

                Business Image              Scientific Image


Thus, Company 3’s good issues are “quality of products, stand¬ 
ing in field of career interest, drive to achieve goals, challenging 
work, caliber of personnel, basic research,” which are basically 
issues relating to how good a company it is. Its poor issues are
“starting salary, training program, consideration of employees,
rapid advancement, encouragement of ingenuity, aid to education,”
which are issues relating to how good a company it is for the
individual.

Also, the issues in which Company 3 is better than Company 1
are “quality of products, standing in field of career interest, drive
to achieve goals, starting salary, training programs, consideration
of employees, and rapid advancement,” which are issues having to
do with its business image. But the issues in which Company 3
does not do as well as Company 1 are “challenging work, caliber
of personnel, basic research, encouragement of ingenuity, aid to
education,” which are issues having to do with its scientific image.

In its advertising against Company 1, Company 3 can best afford 
to stress the issues in the upper left-hand box: “quality of product, 
standing in field of career interest, drive to achieve goals.” The 
issues in the upper right-hand box, “challenging work, caliber of 
personnel, basic research” are also good issues for Company 3. 
However, they represent a “trap” for it, since Company 1 is stronger
than it is in those issues. In fact, the basic research issues
have been so pre-empted by Company 1 that Company 3’s use of
them calls attention to Company 1’s excellence in them.

The issues in the lower left-hand box, “starting salary, training
program, consideration of employees, rapid advancement,” are not 
particularly good for Company 3 but Company 1 is no better, so 
that stressing these issues causes small gains for Company 3 as 
opposed to Company 1. Finally, the issues in the lower right-hand 
box, “encouragement of ingenuity and aid to education,” are bad 
ones for Company 3 in which Company 1 is better. These issues 
represent a product problem for Company 3 in relation to Com¬ 
pany 1. They are issues in which Company 3 will have to make 
basic changes before using them. 

If we compare Company 3 with all of its 3 competitors, we get 
the following (see table, p. 133): 

The numbers in parentheses represent the companies that are 
better than Company 3. 

Here it is seen that there are three issues which are good ones 
for Company 3 in which it does better than all of its competitors, 
namely: “quality of products, standing in field of career interest, 
drive to achieve goals.” Those are obviously Company 3’s best
opportunities for improving its standing against all its competitors.

                Company 3 Better            Company 3 Not as Good
                Than Competitors            as Competitors

Good Issues     Quality of Products         Challenging Work (1)
                Standing in Field of        Caliber of Personnel (1)
                  Career Interest           Basic Research (1)
                Drive to Achieve Goals

Poor Issues     Starting Salary             Encouragement of
                                              Ingenuity (1, 4)
                                            Aid to Education (1, 2, 4)
                                            Training Program (2)
                                            Consideration of
                                              Employees (2, 4)
                                            Rapid Advancement (2)


The issues in the upper right-hand box, “challenging work, cali¬ 
ber of personnel, basic research,” are good ones for Company 3, 
but it loses ground to Company 1, when both are using them. 
However, it does gain on Companies 2 and 4. Company 1 seems to 
have pre-empted those issues. 

In the lower left-hand box, there is the issue, “starting salary,”
which is not a particularly good one for Company 3; however, it
does gain on all of its competitors through the use of it. This
represents an area for possible education of the population.

Finally, in the lower right-hand box are issues which are not so
good for Company 3, and in which some competitor does better.
In fact, the issue, “encouragement of ingenuity,” is one in which
both Company 1 and Company 4 do better. All three competitors
do better on “aid to education.” Companies 2 and 4 do better on
“consideration of employees,” and Company 2 does better on “training
program” and “rapid advancement.”

Let us see how the issues divide up for the other three companies.

                              Company 1

                Company 1 Better            Some Competitor Better
                Than Competitors            Than Company 1

Good Issues     Challenging Work            Quality of Products (3, 4)
                Basic Research
                Encouragement of
                  Ingenuity
                Caliber of Personnel

Poor Issues                                 Aid to Education (2, 4)
                                            Drive to Achieve Goals (2, 3, 4)
                                            Training Programs (2, 3, 4)
                                            Standing in Field of Career
                                              Interest (2, 3, 4)
                                            Starting Salary (2, 3, 4)
                                            Consideration of Employees
                                              (2, 3, 4)
                                            Rapid Advancement (2, 3, 4)


This shows that Company 1’s good issues are “challenging work,
basic research, encouragement of ingenuity, caliber of personnel,
and quality of products.” Of those, the first four are issues in which
it does better than all of its competitors. In the last issue, “quality
of products,” there are two companies, Companies 3 and 4, which
do better. Its good issues tend to relate to how good a scientific
company it is.

Company 1’s poor issues are “aid to education, drive to achieve
goals, training program, standing in field of career interest, starting
salary, consideration of employees, and rapid advancement.”
Oddly enough, Company 1 does not top any of its competitors in
any of these issues. In fact, it is topped by all of its competitors in
all of the issues except for “aid to education,” where Companies 2
and 4 do better. Company 1’s poor issues tend to be issues having
to do with its business characteristics and its dealings with the
individual.

                              Company 2

                Company 2 Better            Some Competitor Better
                Than Competitors            Than Company 2

Good Issues     Aid to Education            Quality of Products (1, 3, 4)
                Training Programs

Poor Issues     Rapid Advancement           Drive to Achieve Goals (3, 4)
                                            Caliber of Personnel (1, 3)
                                            Encouragement of Ingenuity
                                              (1, 3, 4)
                                            Basic Research (1, 3)
                                            Challenging Work (1, 3, 4)
                                            Standing in Field of Career
                                              Interest (3, 4)
                                            Starting Salary (3, 4)
                                            Consideration of Employees
                                              (3, 4)


Company 2’s good issues are “aid to education, training program,
and quality of products.” Of these, it is better than all of its
competitors in the first two. In the third one, “quality of products,”
it is excelled by all three of its competitors.

Company 2’s poor issues are “rapid advancement, drive to
achieve goals, caliber of personnel, encouragement of ingenuity,
basic research, challenging work, standing in field of career interest,
starting salary, and consideration of employees.” However, the
first of these, “rapid advancement,” is one in which it does better
than its competitors. This gives it a possible issue to use in an
educational way. In the remaining issues, there are at least two
competitors that do better in each.

Company 2’s best issues seem to deal with the help it gives the
individual to improve himself.

Finally, let us look at Company 4.

                              Company 4

                Company 4 Better            Some Competitor Better
                Than Competitors            Than Company 4

Good Issues                                 Quality of Products (3)
                                            Encouragement of Ingenuity (1)
                                            Drive to Achieve Goals (3)
                                            Standing in Field of Career
                                              Interest (3)
                                            Aid to Education (2)

Poor Issues     Consideration of            Challenging Work (1, 3)
                  Employees                 Caliber of Personnel (1, 2, 3)
                                            Basic Research (1, 2, 3)
                                            Starting Salary (3)
                                            Training Programs (2, 3)
                                            Rapid Advancement (2, 3)


Company 4’s good issues are “quality of products, encourage¬ 
ment of ingenuity, drive to achieve goals, standing in field of 
career interest and aid to education.” However, in each of these 
there is a competitor which does better. 

Its poor issues are “consideration of employees, challenging 
work, caliber of personnel, basic research, starting salary, training 
programs, and rapid advancement.” Of these, Company 4 does 
better than all of its competitors in the issue, “consideration of 
employees,” and worse than some competitor in the others. 

The reader will note that the issue “quality of products” is a
good one for all four companies, while the issues “rapid advancement”
and “starting salary” are poor ones for all of the companies.



A Decision-Theoretic Analysis of a 
Problem in Political Campaigning 

GERALD H. KRAMER 

University of Rochester 

1.1 In the past two decades, the use of quantitative methods as 
aids for decision-making has become common in many fields, par¬ 
ticularly those involving industrial and military operations. More 
recently, efforts have been made to apply these methods to other 
governmental activities. 1 By and large, however, these efforts have 
not been made by political scientists, nor have the methods em¬ 
ployed, despite their increasing sophistication and power, had 
great impact upon the discipline. This is unfortunate, for many of 
the traditional concerns of political scientists appear to be quite 
susceptible to this sort of analysis. In this paper, we will attempt 
to show how such a quantitative decision-theoretic approach 
might be used to analyze a practical political problem, namely the 
problem of conducting a door-to-door canvass of voters, for parti¬ 
san campaign purposes. 

Such a demonstration may be of interest for two reasons. First, 
it may lead to results which are of substantive or practical interest 
to the student of political campaigning. In the course of our an¬ 
alysis we will suggest some rough rules of thumb and then develop 
a systematic optimization procedure for efficient canvassing; we 
will also offer some tentative conclusions concerning the relative 
efficiencies of several simpler canvassing strategies, and indicate 
the relevance of our findings to other campaign problems. 

The demonstration may also be of broader methodological in¬ 
terest. In political science there has been considerable debate and 
discussion as to whether certain concepts can be quantified, or 
certain problems studied quantitatively. In fact, there is no reason 
to doubt that quantitative research, of some kind, can be done on
almost any problem; the only interesting questions are whether it 

1 For examples, see R. N. McKean, Efficiency in Government Through Systems
Analysis (New York: Wiley, 1958); H. G. Schaller, ed., Public Expenditure Decisions
in the Urban Community (Baltimore: Johns Hopkins Press, 1963); and chaps. vii
and xiii of D. B. Hertz and R. T. Eddison, eds., Progress in Operations Research, Vol.
II (New York: Wiley, 1964).




should be done, and how. But these questions—as R. L. Ackoff, for 
example, convincingly argues 2 —cannot be satisfactorily understood 
except by examining them in the context of the uses to which the 
research results are ultimately to be put. We will not be specific¬ 
ally concerned with methodological questions here. Nevertheless, 
by focusing explicitly on the question of uses, and by showing one 
way in which quantitative empirical results can be applied to 
solve a specific problem, we may at least be able to indicate a 
perspective, by means of which some of the methodological issues 
of quantitative empirical research may come to be better under¬ 
stood. 

1.2 The organization of this study is as follows: in section 2 we 
formulate the overall problem of resource allocation in political 
campaigning, within a general decision-theoretic framework. We 
then narrow the focus to canvassing, and in section 3 develop a 
simple quantitative model of a political canvass. In section 4 we 
describe a general technique, based upon the model, which sys¬ 
tematically discovers the optimal allocation of canvassing effort, in 
any constituency and for any budget size. We also describe some 
simpler canvassing strategies, and then demonstrate and compare 
all of these approaches by applying each to a hypothetical con¬ 
stituency. This analysis is based upon a number of simplifying 
assumptions; in section 5 we explore the question of how our con¬ 
clusions are affected when these simplifying assumptions are re¬ 
laxed, in various ways. Finally, some brief concluding remarks are 
offered in section 6. 

2.1 In general terms, we can describe a decision problem as fol¬ 
lows: we have a decision-maker, who is confronted with a set of 
alternative, mutually exclusive courses of action, and who is inter¬ 
ested in attaining certain possibly conflicting goals or objectives. 
The available alternatives are related to the objectives, perhaps in 
complex and uncertain ways; the decision-maker’s problem is to 
select that alternative which is “best” in terms of his goals. Quanti¬ 
tative analysis of such a problem requires that we provide a con¬ 
cise description of the problem, a precise criterion of “best,” and 
finally, a systematic way to use this information to discover which 
alternative is in fact best. A comprehensive solution to the overall 

2 R. L. Ackoff, S. K. Gupta, and J. S. Minas, Scientific Method: Optimizing Applied
Research Decisions (New York: Wiley, 1962).

problem of conducting a political campaign is hardly feasible at
present. However, as a first step toward that ultimate goal, and
also as background for the more detailed treatment of canvassing
in sections 3 to 5, let us briefly attempt a preliminary formulation
of the overall problem.

2.2 The range of alternatives confronting a candidate running 
for office is truly enormous. Among the subjects dealt with in one 
well-known campaign manual, for example, are the following: 
registration drives, mail campaigns, house-to-house canvassing,
bumper sticker campaigns, special group activities, coffee parties,
larger receptions, plant visits, sound trucks, meetings and debates,
television, telephone campaigns, voter transportation, and poll
watching. 3 Each such activity can be carried out in a variety of
ways, and the purpose of a campaign manual is presumably to de¬ 
scribe some of the more efficient ways. 

In addition to these various tactical questions, there is also the 
broader strategic question of deciding between activities. If our 
resources are limited, then to increase the scale of one activity 
(e.g., to make more use of TV) means we must cut back on some
other activity (e.g., plan a smaller canvass); somehow, we must
decide which activities to increase and which to cut back, in order
to achieve a balanced overall campaign strategy. Let us suppose
that there are $n$ distinct activities, and that for each we have a
quantitative measure of the overall level of the activity, e.g., so
many man-hours of canvassing, or hours of TV, etc. Then we can
concisely represent any campaign strategy by its activity levels
$x_1, \ldots, x_n$. The set of all possible strategies is the set of all such
n-tuples, and the set of available strategies is the subset of such
n-tuples which are feasible in terms of the resources available.
Thus, if the only resource which is limited is money, and if the $i$th
activity costs $c_i$ dollars per unit, $i = 1, 2, \ldots, n$, then the set of
available strategies is the set of n-tuples which satisfy

$\sum_{i=1}^{n} c_i x_i \le B$,

where $B$ is the maximum possible campaign budget, in dollars.
The campaign problem is to determine which of these n-tuples is
best, according to some well-defined criterion of “best.”
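The budget constraint can be checked mechanically for any proposed strategy. The Python sketch below is illustrative only: the function name `is_feasible`, the two activities, and their unit costs are hypothetical, and the requirement that activity levels be non-negative is an added assumption that the text leaves implicit.

```python
def is_feasible(levels, unit_costs, budget):
    """Check whether a strategy (x_1, ..., x_n) is available.

    levels     -- activity levels x_i (e.g., man-hours, hours of TV)
    unit_costs -- cost c_i in dollars per unit of activity i
    budget     -- maximum campaign budget B in dollars
    """
    if len(levels) != len(unit_costs):
        raise ValueError("one unit cost is needed per activity")
    if any(x < 0 for x in levels):  # assumed: levels cannot be negative
        return False
    total_cost = sum(c * x for c, x in zip(unit_costs, levels))
    return total_cost <= budget


# Hypothetical example: canvassing at $3 per man-hour, TV at $500 per hour.
costs = [3.0, 500.0]
modest = is_feasible([100, 2], costs, budget=1500)     # 300 + 1000 dollars
ambitious = is_feasible([100, 3], costs, budget=1500)  # 300 + 1500 dollars
```

The first strategy costs $1,300 and is available; the second costs $1,800 and is not, so more television can be bought only by cutting back the canvass.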

3 Campaign Manual 1964 (Washington, D.C.: Democratic National
Committee, 1964).

2.3 Just as there are many ways of running a campaign, so also
there is a variety of possible goals which a candidate may be
pursuing. No doubt most candidates are interested in winning the
election. Even so, a third-party candidate, for example, may have
no real hope of winning, and may therefore gear his campaign
strategy to other goals, such as getting his “message” across, or
depriving one of the major-party candidates of votes. Even a 
serious contender for office may place great stress upon factors 
other than success, such as “educating” his constituents, whatever 
the electoral consequences. But however important such goals 
may be in specific instances, if a general analysis is to proceed we 
must concentrate upon the major and most tangible of the goals. 
For most candidates in most contests this goal is clearly to win. 

Political campaigning is an uncertain business, in which no cam¬ 
paign strategy can guarantee victory. Thus one plausible quanti¬ 
tative translation of the goal of winning, which takes this uncer¬ 
tainty into account, is that the candidate wishes, in selecting his 
campaign strategy, to maximize his probability of winning. An 
alternative, though related, formulation is that the candidate 
wishes to maximize the size of his plurality (or more precisely,
since uncertainty is present, his expected plurality $E(n_A - n_B)$,
where $n_A$ and $n_B$ are the votes cast for A and B, respectively, and
where $E$ is the expected-value operator. 4 )

With the usual electoral arrangements, winning is normally
closely related to the size of the candidate’s plurality. However,
these two formulations of the candidate’s goal, though related, may
nevertheless lead to differing recommendations when used to
assess the value of alternative campaign tactics. This is particularly
likely if one of the tactics is very risky, but also potentially very
productive. For example, suppose the choice facing the candidate
is between adopting such a tactic ($t_1$), versus continuing present
tactics ($t_2$); moreover, suppose the candidate now has 55% of the
votes and will maintain this lead for sure with $t_2$. Tactic $t_1$, on the
other hand, will either gain another 40% or lose 10% of the vote, each
with probability .5. If his goal is to maximize his expected plurality,
the proper choice is to adopt $t_1$, since his expected plurality is 40%

4 The expected value of a function is its average, defined by
$E(f) = \sum_{x} x \Pr(f = x)$.

See, e.g., J. G. Kemeny, et al., Finite Mathematical Structures (Englewood Cliffs, N.J.:
Prentice-Hall, 1961).
then, versus 10% with $t_2$. On the other hand, choosing $t_1$ over $t_2$
reduces the probability of winning from 1.0 to .5, and therefore if the
goal is to maximize his probability of winning, exactly the opposite
recommendation is in order. Other things being equal, presumably
most candidates would take the more conservative course and
adopt $t_2$; in that sense, the probabilistic objective is the more
realistic.
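
The arithmetic behind this comparison can be checked in a few lines. The sketch below is illustrative only: the function names, and the representation of a tactic as a list of (probability, vote share) outcomes, are assumptions made for this example. It also assumes a two-candidate race, so that a vote share of s percent corresponds to a plurality of 2s - 100 percentage points.

```python
# Each tactic is a list of (probability, resulting vote share in percent).
# Two-candidate race assumed, so plurality = 2 * share - 100.
T1 = [(0.5, 95), (0.5, 45)]   # risky tactic: from 55%, gain 40 points or lose 10
T2 = [(1.0, 55)]              # present tactics: hold 55% for sure


def expected_plurality(tactic):
    """Expected plurality, in percentage points of the total vote."""
    return sum(p * (2 * share - 100) for p, share in tactic)


def win_probability(tactic):
    """Probability that the candidate's share exceeds 50 percent."""
    return sum(p for p, share in tactic if share > 50)


for name, tactic in (("t1", T1), ("t2", T2)):
    print(name, expected_plurality(tactic), win_probability(tactic))
```

Under these assumptions the risky tactic has the larger expected plurality but the smaller probability of winning, which is the conflict between the two criteria described in the text.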

However, this formulation is computationally quite difficult to 
work with; in practical applications one would have to resort to 
simulation techniques which are expensive and often cumbersome. 
The expected-plurality criterion is much simpler in this respect, 
and possesses the convenient property that if we can evaluate the 
candidate’s expected plurality in each of several subunits (e.g., 
precincts) in his constituency, then his overall plurality can be 
obtained by simple summation. Clearly this is not true of the prob¬ 
abilistic criterion. Moreover, the expected-plurality criterion is 
more easily comprehended and communicated, since campaigners 
traditionally think in terms of so many votes gained or lost, and 
the criterion translates directly into these terms. Either formula¬ 
tion provides us with a reasonable, quantitative value criterion; 
however, in subsequent discussion we shall employ the expected- 
plurality criterion. 

2.4 Suppose the following: that we have settled on a value criterion
V; that, after extensive empirical analysis, we are able to
predict what level of V will result from implementing any particular
campaign strategy $X_1, X_2, \ldots, X_n$; that activity i, i = 1, 2, ..., n,
costs $c_i$ dollars per unit (man-hour, TV-minute, or whatnot); and
that the total cost of whatever strategy we adopt shall not exceed
our budget B. The candidate wishes to find the best feasible strategy;
thus the overall campaign problem is to find $X_1, \ldots, X_n$ such
that

$$V(X_1, \ldots, X_n) = \text{maximum},$$

subject to (2.1)

$$\sum_{i=1}^{n} c_i X_i \le B.$$

Under certain reasonably general assumptions about the function
$V(X_1, X_2, \ldots, X_n)$, 5 the following will be true: at the maximum, the

5 Specifically we assume V to be continuous, increasing, and concave, and every 
activity to be sufficiently productive so that the problem of corner maxima does not 
arise. See, e.g., the Appendix, "The Simple Mathematics of Maximization,” in C. J. 
Hitch and R. N. McKean, The Economics of Defense in the Nuclear Age (Cambridge: 
Harvard University Press, 1961). 
marginal increase in V produced by spending an additional dollar
on any activity must equal that produced in any other activity.
Conversely, if the marginal increase in some activity i is less than
that in another activity j, then clearly we can obtain a better
strategy by reallocating funds from i to j (unless the allocation to
i is already zero dollars, in which case further reallocation is impossible).
These marginal increases, or marginal productivities,
play an important role in discovering and verifying a solution to
(2.1). Hence in studying campaigning, or any of the various campaign
activities, one important aim of the analysis is to provide a
basis for calculating these marginal productivities.
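
The equal-marginal-productivity condition can be written out symbolically. The sketch below assumes, in addition to the conditions of footnote 5, that V is differentiable and that the budget is fully spent at the maximum; the multiplier $\lambda$ is notation introduced here, not in the original.

```latex
\mathcal{L}(X_1,\ldots,X_n,\lambda)
  = V(X_1,\ldots,X_n) + \lambda\Bigl(B - \sum_{i=1}^{n} c_i X_i\Bigr),
\qquad
\frac{\partial \mathcal{L}}{\partial X_i} = 0
\;\Longrightarrow\;
\frac{1}{c_1}\frac{\partial V}{\partial X_1}
  = \frac{1}{c_2}\frac{\partial V}{\partial X_2}
  = \cdots
  = \frac{1}{c_n}\frac{\partial V}{\partial X_n}
  = \lambda .
```

Each ratio is the gain in V from one more dollar spent on activity i, so $\lambda$ can be read as the marginal productivity of the budget itself.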

3.1 To demonstrate how such an analysis might proceed, we will 
examine one of these activities in greater detail. The problem 
which we consider is that of conducting a precinct-level door-to- 
door canvass of voters during a campaign, in order to pass out 
literature, reinforce the faithful and convert the opposition, and so 
on. In conducting a drive of this sort there are a number of choices 
to be faced, concerning which areas of the constituency shall be 
canvassed, what type of literature and of approach shall be em¬ 
ployed, which routes shall be assigned to which workers, and so 
forth. Here, we consider only the two broadest problems, concern¬ 
ing the choice of localities and of “tactics,” a term to be defined 
below. 

Conducting a canvass requires the expenditure of various kinds 
of resources, such as labor, printed materials, etc. We assume that 
it is always possible to obtain additional quantities of any of these 
resources, at fixed costs, if necessary; hence the only resource 
limitation we need consider is the overall budget constraint. A 
canvassing budget of given size can be employed in a variety of 
ways, producing a variety of different effects. Our problem here is 
to determine—or more accurately, to obtain a method for deter¬ 
mining—which of these possible ways is “best,” in terms of the 
expected plurality produced. As a first step in this endeavor, we 
proceed now to construct a model of a political canvass, with 
which we can assess the effects of alternative canvassing strategies. 

3.2 By a model of a canvass, we mean a symbolic representation 
of the process, which can be manipulated for predictive purposes. 
The elements of our model are the following: we assume the
electorate to be partitioned into a number k of small, relatively
homogeneous units such as precincts or voting districts. In the i-th
precinct, let

$N^i$ be the number of residents,

$P_R^i$ be the fraction of residents who are registered voters, of whom

$P_V^i$ actually vote, and

$P_A^i$ and $P_B^i$ prefer parties A and B respectively.

Where we are speaking of a single precinct and no ambiguity
will result, we will usually omit the superscripts. Notice that, as of
the time of the canvass, the quantities $P_A$, $P_B$, $P_V$ are predicted
rather than actual values; they are forecasts of what will happen
several days or weeks hence, on election day. These predictions
need not be extremely accurate; extrapolation from past comparable
elections would suffice. We assume that it is possible to make
these predictions about each precinct, at negligible cost. We also
assume that it is possible, though not necessarily inexpensive, to
determine for individual voters within the district whether they
are registered, and which party they prefer. In a well-organized
precinct, this is the type of information which would be contained
in the party’s card file; in an unorganized precinct, it might be
possible to use official registration data (where partisan registration
is in effect), or it might be necessary to conduct a precampaign
canvass. Again, this information on individual voters need
not be perfectly accurate, though for simplicity we will assume,
initially, that it is. We also assume that in any single homogeneous
precinct, turnout and partisanship are statistically independent.

Next, we assume that within any single precinct there are two 
basic tactics available to the party conducting the canvass. In the 
first, which we will refer to as a “blind” canvass, the party sys¬ 
tematically contacts every person in the precinct, irrespective of 
registration and partisanship. The second tactic is a “selective” 
canvass, in which only registered partisans of the party are con¬ 
tacted. Clearly there are other possibilities as well, such as con¬ 
tacting all registered voters or attempting to contact only the 
habitual nonvoters within the precinct. Here, however, we will 
consider only the two representative tactics described above. 

Our model must finally take into account the response of the
individual voters in the constituency to contact by a party worker.
Many different kinds of effect are possible, e.g., upon the voter’s
motivations and attitudes, his knowledge, or his subsequent behavior.
For our purposes, however, only those responses which
affect our candidate’s plurality in the forthcoming election are
immediately relevant; hence, we ignore the various possible psychological
effects and confine ourselves to the question of how
partisan contact affects the recipient’s subsequent voting behavior.
It is useful to distinguish between two possible types of voting-behavior
effect, which we shall refer to as preference and turnout
effects. By a preference effect, we mean any alteration in a voter’s
candidate preference or, more precisely, in the probability that,
if he votes, he will cast his ballot for a given candidate. By a
turnout effect, we mean any alteration in the probability that he
will vote at all, for either candidate.

Obviously, the questions of the existence and of the magnitudes 
of these effects are empirical questions, and can only be settled by 
empirical investigation. In fact, several such investigations have 
been performed by various researchers. It would be too much of a 
diversion, here, to review the methods and results of each of these 
studies; in summary, however, they seem to show the following: 
Preference effects, in contested partisan elections, are small and 
statistically insignificant in magnitude, and do not follow any con¬ 
sistent pattern in direction. Henceforth we will ignore them. Siz¬ 
able turnout effects, however, do apparently exist. These effects 
are positive, in the sense of increasing (rather than decreasing) 
turnout probabilities, and for practical purposes their magnitudes 
can be taken to be independent of such factors as the partisanship 
of the contact or of the recipient, or the level of the office being 
contested. 6 For our purposes, of course, we need a precise and
quantitative description of these effects. The following simple
model is convenient to work with and has proven to be a realistic
formulation empirically:

$$\Pr(V \mid C) = \Pr(V \mid \bar{C}) + \alpha\,[1 - \Pr(V \mid \bar{C})] \qquad (3.1)$$

Here, $\Pr(V \mid \bar{C})$ is the probability of voting in the absence of
contact, $\Pr(V \mid C)$ is the probability of voting after having been
contacted, and $\alpha$ is a parameter. In terms of relative frequencies, the
model asserts that if a large group of voters is canvassed, then the
final turnout rate will equal the precontact rate $\Pr(V \mid \bar{C})$, plus a
certain fraction $\alpha$ of that portion of the group which would not
otherwise have voted. That is, the rate (or probability) of nonvoting
is reduced by a constant fraction $\alpha$. Evidently the precontact
turnout probabilities (and therefore also the postcontact
probabilities) can vary from voter to voter, or precinct to precinct;
$\alpha$, however, is constant for all voters and for all precincts.
Empirically, a typical or average value of $\alpha$ is .4. 7 (Clearly we are
speaking only of registered voters, since both the pre- and
postcontact turnout probabilities of unregistered voters are always
zero.)

6 For a summary of most of the available evidence, see G. H. Kramer,
Theoretic Analysis of Canvassing and Other Precinct-Level Activities in
Campaigning” (Doctoral dissertation; MIT, 1965), chaps. iii and iv.
3.3 The value criterion which we wish to maximize is the candidate’s
constituency-wide expected plurality. This overall plurality
equals the sum of the candidate’s sub-pluralities in each precinct;
hence let us initially consider the effects of our tactics in a single
precinct. If we use $P_V$ as the value of $\Pr(V \mid \bar{C})$ and $P_A$ as the
probability of voting for candidate A, for a voter drawn at random
from the precinct, then in the absence of any canvassing the expected
plurality is

$$P_A P_V P_R N - P_B P_V P_R N = (P_A - P_B) P_V P_R N.$$

If we assume a pure two-party system where $P_A + P_B = 1$, then
this reduces to

$$(P_A - [1 - P_A]) P_V P_R N = (2P_A - 1) P_V P_R N. \qquad (3.2)$$

Now suppose that the entire precinct is canvassed blindly, that
is, that all N of its residents are contacted. Evidently the
effect of such a canvass is to increase the turnout probability
somewhat, according to (3.1):

$$\Pr(V \mid C) = \Pr(V \mid \bar{C}) + \alpha[1 - \Pr(V \mid \bar{C})] = P_V + \alpha(1 - P_V).$$

Hence the expected plurality becomes

$$[2P_A - 1][P_V + \alpha(1 - P_V)] P_R N,$$

and the net addition to the candidate’s plurality from the canvass is

$$[2P_A - 1][P_V + \alpha(1 - P_V)] P_R N - [2P_A - 1] P_V P_R N = \alpha(1 - P_V)(2P_A - 1) P_R N. \qquad (3.3)$$

We divide this net gain in votes by the number of
voters contacted, N, to obtain the net votes gained per contact
(or productivity per contact, $p_1$) for a blind canvass:

$$p_1 = \alpha(1 - P_V)(2P_A - 1) P_R. \qquad (3.4)$$

Now suppose that a selective canvass is used, in which only

7 Ibid., especially pp. 72-75.
registered A-partisans are contacted. After the canvass, the expected
plurality is evidently

$$P_A [P_V + \alpha(1 - P_V)] P_R N - (1 - P_A) P_V P_R N = \alpha[1 - P_V] P_A P_R N + (2P_A - 1) P_V P_R N.$$

By subtracting (3.2) we obtain the net addition to the candidate’s
plurality produced by the canvass, and by dividing by the
number, $P_A P_R N$, of voters contacted we have the per-contact
productivity for a selective canvass,

$$p_2 = \alpha(1 - P_V). \qquad (3.5)$$

Note that $p_1$, unlike $p_2$, can become negative because of the
$(2P_A - 1)$ term; in neighborhoods where the candidate has only
minority support, blind canvassing is counterproductive. Both
expressions contain a $(1 - P_V)$ term, and hence either type of
canvass is relatively more effective in low-turnout neighborhoods, and
also in off-year elections.

Finally, let $c_1$ and $c_2$ be the costs per contact of conducting blind
and selective canvasses, respectively, where $c_2 > c_1$.
As expenditure on either tactic increases, evidently
plurality increases initially with slope $p_1/c_1$ or $p_2/c_2$. Eventually,
when all suitable voters in the precinct have been contacted, further
increases in expenditure produce no additional gains in plurality.
Graphically, the overall gain-cost relations are as shown in Figure
3.1 (for a precinct for which $P_A > .5$).

Figure 3.1 (figure not reproduced; vertical axis: gain in plurality)

4.1 Our model enables us to determine the consequences of any
particular allocation of resources to precincts and tactics, and because
of the very simple structure of the model, these consequences
can be readily traced out by hand calculation. Our purpose
in constructing the model was to use it in order to find the
best, or optimal, allocation; thus, in principle, we might try to
enumerate systematically every possible allocation, use the model
to predict the expected gain in plurality produced by each, and
finally select that allocation with the greatest gain. Obviously such
an approach is tedious at best, and furthermore there is no
assurance that it will ever discover the best allocation, since there
are infinitely many possibilities to be tried. It would clearly be
desirable to have an efficient and systematic method for discovering
the optimal allocation without the necessity of an exhaustive
search of the alternatives. We proceed now to describe such
a method.


4.2 In our overall optimization problem, we must decide how to
allocate our resources across precincts, and also which tactic shall
be employed in each precinct. Let us first consider the latter question;
from inspection of Figure 3.1 it is evident that, in a precinct
of the type depicted, if the expenditure in the precinct is large then
the selective tactic will be preferred. The maximum gain possible
from blind canvassing (when every voter in the precinct is contacted)
is given by the maximum possible number of such contacts,
N, times the per-contact productivity, $p_1$; thus, using (3.4),
the maximum gain is

$$p_1 N = \alpha(2P_A - 1)(1 - P_V) P_R N. \qquad (4.1)$$

In selective canvassing the maximum number of contacts is the
number of registered A-partisans, $P_A P_R N$, and the per-contact
productivity is $p_2$, so from (3.5), the maximum gain is

$$p_2 P_A P_R N = \alpha(1 - P_V) P_A P_R N. \qquad (4.2)$$

Whenever $P_A < 1$, evidently this latter expression is larger, since
then $P_A > 2P_A - 1$. When
expenditure levels are large enough so that the precinct can be
saturated, the blind canvass is inferior because it inevitably activates
some opposition voters, whereas a selective canvass does not.
Conversely, in Figure 3.1 the blind canvass is better at low expenditure
levels because of its lower cost per contact. Whether
this is also true in other precincts depends critically upon the
relative costs per contact, upon the registration rate (since contacts
with unregistered persons, however cheap, are wasted), and
upon the relative number of opposition voters in the precinct.

If $P_A < .5$ then a blind canvass will be counterproductive and
the selective canvass will be the preferred tactic at all expenditure
levels. In general, comparing the respective per-dollar productivities
$p_1/c_1$ and $p_2/c_2$, a blind canvass will be preferred (at
low levels of expenditure) if and only if

$$(2P_A - 1) P_R > \frac{c_1}{c_2}$$

for the precinct in question. Let $f_i(X_i)$ be the function which
predicts the plurality gain in precinct i produced by spending $X_i$
dollars on a canvass with the preferred tactic; graphically, $f_i$ will
be the envelope of the gain-cost functions shown in Figure 3.1,
represented by the dotted line labeled f.
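
The condition under which the blind tactic is preferred at low expenditure follows in one step from the per-contact productivities (3.4) and (3.5). The short derivation below assumes $\alpha > 0$ and $P_V < 1$, so that the common factor $\alpha(1 - P_V)$ is positive and can be cancelled.

```latex
\frac{p_1}{c_1} > \frac{p_2}{c_2}
\;\Longleftrightarrow\;
\frac{\alpha(1 - P_V)(2P_A - 1)P_R}{c_1} > \frac{\alpha(1 - P_V)}{c_2}
\;\Longleftrightarrow\;
(2P_A - 1)\,P_R > \frac{c_1}{c_2}.
```

The turnout rate $P_V$ and the parameter $\alpha$ drop out, so the choice of tactic at low expenditure depends only on partisanship, registration, and the relative costs per contact.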

Now consider the broader question, of how our canvassing effort 
shall be allocated across precincts. Formally, the problem is to 
allocate the available canvassing budget B to the k precincts in 
such a way as to make the total plurality gain F as large as pos¬ 
sible; thus, 

$$F = \sum_{i=1}^{k} f_i(X_i) = \text{maximum},$$

subject to (4.3)

1. $\sum_{i=1}^{k} X_i \le B$

2. $X_i \ge 0$, $i = 1, \ldots, k$

This is a familiar constrained-maximization type of problem; however,
the function to be maximized is not sufficiently smooth to
permit use of the calculus to find the maximum. Other techniques,
such as linear programming, do deal with piecewise-linear functions,
such as we have here; unfortunately, however, in the present
problem some of the payoff functions $f_i$ are not concave. 8 Without
going into details, this means that (4.3), despite its very elementary
structure, in fact constitutes a problem in nonlinear programming,
and a solution procedure would be complicated. To circumvent
these difficulties we will modify the problem somewhat,
making it soluble by a much simpler procedure. The modification
consists of replacing the true payoff functions $f_i$ by new, approximate
functions $f_i'$, which are concave. In precincts where the selective
tactic is always better, the true function $f_i$ is already concave,
so in this case $f_i'$ and $f_i$ are identical. Where the blind tactic is


8 A concave function is one which, roughly speaking, obeys a law of diminishing 
returns; specifically a function f is concave if and only if the chord which connects 
any two points of the function lies on or below the function between those points. 
On the relevance of concavity to programming, see any standard text, or, e.g.,
Markowitz and A. S. Manne, “On the Solution of Discrete Programming Problems,”
Econometrica, XXV (1957), 84 ff.
better initially, however, $f_i$ is not concave, so we replace it by a
concave function $f_i'$ composed of two line segments. Hence in this
case the concave approximation $f'$
will be related to the true function f as shown in Figure 4.1.

Figure 4.1 



We can write $f'$ as a weighted sum

$$f_i'(X_i) = a_{i1} y_{i1} + a_{i2} y_{i2},$$

where $a_{i1}$, $a_{i2}$ are the slopes of the two line segments, and where

$$X_i = y_{i1} + y_{i2},$$

$$0 \le y_{i1} \le c_1 N^i,$$

$$0 \le y_{i2} \le c_2 P_A^i P_R^i N^i - c_1 N^i.$$

Where the selective tactic is always best, evidently
$f_i'(X_i) = a_{i1} y_{i1}$. Our modified problem is thus


$$F' = \sum_{i} \sum_{j} a_{ij} y_{ij} = \text{maximum},$$

subject to (4.4)

1. $\sum_{i} \sum_{j} y_{ij} \le B$

2. $0 \le y_{ij} \le c_1 N^i$ or $c_2 P_A^i P_R^i N^i - c_1 N^i$,

as defined above. This modified problem can be
solved by the following simple procedure. First, evaluate $a_{i1}$, $a_{i2}$,
$c_1 N$, and $c_2 P_A P_R N$ for each precinct; second, arrange the a’s in order
of decreasing size,

$$a_{(ij)_1} \ge a_{(ij)_2} \ge a_{(ij)_3} \ge \cdots;$$

and finally, for any budget B, invest in the y’s in the same order,
raising each y to its maximum and then going on to the next y on
the list, until the budget is exhausted. When this procedure is
completed we will have some $y_{ij}$’s which have been set to their maximum
values, some which are zero, and perhaps one (the last one
on the list before the budget ran out) which is less than maximal
but greater than zero. In any precinct, if $y_{i2} > 0$, then we should
spend $y_{i1} + y_{i2}$ dollars on a selective canvass in that precinct; if
$y_{i2} = 0$ and $y_{i1} > 0$, then we should spend $y_{i1}$ dollars on a blind
canvass; and if $y_{i1} + y_{i2} = 0$, no canvassing should be done in that
precinct (since the money is better spent elsewhere).
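
The ranking procedure just described can be sketched as a short routine. This is a minimal illustration, not the author's own program: the function name, the (label, slope, capacity) representation of each line segment, and the sample numbers are all assumptions made for the example. The sketch also stops at the first non-positive slope, which is permitted because the budget constraint is an inequality.

```python
def greedy_allocate(segments, budget):
    """Fill linear segments in order of decreasing slope until the budget runs out.

    segments -- list of (label, slope, capacity), where slope is votes gained
                per dollar and capacity is the most dollars the segment absorbs
    budget   -- total dollars available
    Returns (allocation, total_gain), allocation mapping label -> dollars.
    """
    allocation = {}
    total_gain = 0.0
    remaining = budget
    # sorted() is stable, so segments with equal slopes keep their input order.
    for label, slope, capacity in sorted(segments, key=lambda s: s[1], reverse=True):
        if remaining <= 0 or slope <= 0:
            break  # budget exhausted, or no productive segment remains
        spend = min(capacity, remaining)
        allocation[label] = spend
        total_gain += slope * spend
        remaining -= spend
    return allocation, total_gain


# Hypothetical two-precinct example: precinct A has two segments (blind, then
# the switch to selective saturation); precinct B has a single selective segment.
example = [
    ("A, first segment", 0.9, 100),
    ("A, second segment", 0.4, 26),
    ("B, only segment", 0.8, 126),
]
allocation, total = greedy_allocate(example, budget=150)
print(allocation, round(total, 1))
```

Because each approximation is concave, a precinct's second segment always has a smaller slope than its first, so the ordering never fills a second segment before the first one is saturated.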

It is straightforward to show that the allocation resulting from
this procedure is indeed a solution to (4.4); if $y^*$ is the first y in
the sequence which has not been made as large as possible, and $a^*$
is its slope, then reallocating X dollars into some of the unused y’s
will increase the gain by $\le a^* X$ (since the a’s for all unused y’s are
$\le a^*$), while the loss produced by withdrawing these dollars from
earlier y’s will be $\ge a^* X$ (since the earlier a’s are all $\ge a^*$); hence
no such reallocation can increase the objective function, and the
original allocation is indeed a maximum.

However, (4.4) involves the concave approximations $f'$ rather
than the true payoff functions f, and it is possible that a solution to
(4.4) is still not optimal in terms of the original formulation (4.3).
Our concave approximations are such that

$$f(X) \le f'(X), \qquad (4.5)$$

for any X. When $X = 0$, or $c_1 N$, or $c_2 P_A P_R N$ (or more precisely,
where the corresponding y’s are either zero or as large as possible),
then f and $f'$ are equal. Let us choose our budget B so that this is
the case in every precinct, and let $z_1, \ldots, z_k$ be the resulting
allocation; then since $f'(z) = f(z)$ in each precinct, it follows by
summing over precincts that

$$\sum f'(z) = \sum f(z). \qquad (4.6)$$

Now suppose we reallocate in some fashion, so that the allocation
in precinct i becomes $z_i + d_i$ dollars, and where

$$\sum d_i \le 0$$

(since we are not to exceed the budget constraint). By the argument
of the last paragraph, no such reallocation can increase the
concave-approximation payoff; hence

$$\sum f'(z + d) \le \sum f'(z). \qquad (4.7)$$

However, letting $X = z + d$ in (4.5) and summing, evidently

$$\sum f(z + d) \le \sum f'(z + d). \qquad (4.8)$$

Hence if we combine this with (4.6) and (4.7), we have

$$\sum f(z + d) \le \sum f'(z + d) \le \sum f'(z) = \sum f(z), \qquad (4.9)$$

so that for this choice of B, no reallocation can increase the true
payoff. For other budget levels it follows from (4.8) and
(4.9) that the true gain will be less than, or possibly equal to, the
solution to the modified problem (4.4).

What we have shown, then, is the following: by reformulating
the original problem we obtained a modified problem which is
soluble by a simple clerical procedure. For certain budget
sizes, the allocation obtained by solving the modified problem is
optimal with respect to the original problem (4.3) also; for other
budgets we tend to overestimate the true gain produced by the
allocation. Even so, however, the allocation is near-optimal,
since in only one precinct (or more precisely, for only
one $y_{ij}$) will resources have been committed in a less-than-optimal
manner. For practical purposes, then, we have a solution
procedure for our canvassing model. In section 4.4 we will apply
the procedure to a hypothetical constituency.

4.3 The algorithm described above produces canvasses in which
both inter-precinct resource allocation and intra-precinct tactical
choice are optimized. While the procedure is simple to
apply, nevertheless it does require more clerical and computational
effort than would be needed if a simpler canvassing strategy
were used. A relevant question, therefore, is whether this type of
formal optimization is worth the extra effort it requires. To gain
some insight into this question, let us consider some alternative,
simpler canvassing strategies. The approach described in the preceding
section, in which both inter-precinct resource allocation
and tactical choice are optimized, we shall refer to as “full”
optimization.

At the opposite extreme, the simplest type of canvass is one in
which all canvassing is blind, and is conducted upon arbitrarily or
randomly selected voters. There is reason to believe that a great
deal of canvassing in American elections is done blindly, and while
no sensible campaigner would select precincts
at random in the literal sense, by means of a random number table,
nevertheless it may be that whether a voter is canvassed or not
depends upon factors which are essentially unrelated to productivity,
such as the availability of volunteers or a block
captain locally, or the state of the organization in the precinct. If
this is so then taking the voters to be chosen at random is a reasonable
if rough representation. We refer to this type of operation
as blind-random canvassing. The per-dollar productivity of such
a canvass is a weighted average of the blind-canvass productivities
in each precinct, the weights being the relative sizes of the precincts.

A more complicated but presumably more efficient mode of
operation is where all canvassing is done blindly, but the precincts
to be canvassed are chosen optimally. The Democratic canvass
conducted in Los Angeles County in the 1962 California gubernatorial
campaign may have approximated this pattern. 9 To choose
optimally we compute the blind-canvass productivities $p_1/c_1$ for
each precinct, arrange the precincts in order of decreasing
productivity, and invest in them in that order until the budget is
exhausted, or until the productivities become negative, whichever
occurs first. We shall refer to this as a “blind-optimal” canvass.

Still another mode of operation is always to canvass selectively, 
but in precincts chosen at random. The average productivity is a 
weighted average of each of the selective-canvass productivities, 
the weights being the fraction of all registered A-partisans be¬ 
longing to each precinct. We refer to this type of operation as a 
“selective-random” canvass. 

Finally, we have a “selective-optimal” canvass, in which we
canvass selectively in the most productive precincts. To select the
most productive precincts, we rank them according to their
selective-canvass productivities $p_2/c_2$ and then invest in the precincts in
that order, until the budget is exhausted.

To recapitulate, in planning a canvass we must decide which
precincts to canvass, and which tactic to use in those precincts.
If the blind tactic is to be used everywhere, then we have either
a blind-optimal or a blind-random canvass according to whether
we attempt to choose the most productive precincts or not; if the
selective tactic is to be used throughout, then we have either a
selective-optimal or selective-random canvass, again depending
upon whether the choice of precincts is optimized or not. Finally,
full optimization resembles blind or selective optimization in
attempting to optimize the choice of precincts, but it differs from
both in that it does not require the same tactic to be used
everywhere.

4.4 In order to compare these modes of operation we will apply 
each to the constituency described in Table 4.1. The data are 

9 See Helen Fuller, "The Man to See in California,” Harper’s Magazine, CCXXVI 
(January, 1963), 64 ff. 
imaginary; however, they were chosen so as to present a plausible
constituency, and also (by setting $P_A > .5$ in most precincts)
so as to make blind canvassing reasonably productive.


Precinct   P_A   P_R   P_V     N

    1      .9    .9    .8    1000
    2      .9    .9    .6    1000
    3      .9    .7    .8    1000
    4      .9    .7    .6    1000
    5      .7    .9    .8    1000
    6      .7    .9    .6    1000
    7      .7    .7    .8    1000
    8      .7    .7    .6    1000
    9      .4    .9    .8    1000
   10      .4    .9    .6    1000
   11      .4    .7    .8    1000
   12      .4    .7    .6    1000

Table 4.1

We assume that blind canvassing costs ten cents per contact
(which is probably a realistic, though rough, figure), and that
selective canvassing costs twenty cents per contact (which is a
guess). To apply the algorithm it is necessary to
evaluate the slopes $a_{i1}$, $a_{i2}$ of the two line segments of the
concave approximations. Using expressions (4.1) and (4.2) we can
calculate the gains $G_1$, $G_2$ of saturating any precinct with each
tactic, and similarly we can
calculate the costs $C_1$, $C_2$ of doing so. These quantities are tabulated
in columns (2) to (5) of Table 4.2. The productivity of the
blind tactic is then $p_1/c_1 = G_1/C_1$; of the selective tactic,
$p_2/c_2 = G_2/C_2$; and of the transition from blind to selective
saturation, $(G_2 - G_1)/(C_2 - C_1)$. These
quantities are tabulated in columns (6), (7) and (8) of Table 4.2.

In each precinct the first slope $a_{i1}$ is the larger of the entries in
columns (6) and (7) of the table; where this is the blind-canvass
productivity $p_1/c_1$, there is a second slope $a_{i2}$, given in
column (8). To obtain an efficient canvass we invest to saturation
in order of decreasing productivity (i.e., decreasing $a_{ij}$). The most
desirable opportunity (with a productivity of 1.15 votes per
dollar) is a blind canvass in precinct 2; thus we first allocate $100
to saturate that precinct with a blind canvass. We then allocate
$100 to the second most productive opportunity (.9 votes/dollar),
a blind canvass in precinct 4; then $126 to saturate precinct 6 (or
8, 10 or 12, which are equally productive) with a selective canvass,
and so on until the budget is exhausted. If the budget is
large enough, the final allocation will be to the least productive
opportunity, a switch from blind to selective saturation of precinct 1,
which costs $62 and produces only .12 votes per dollar
spent, for a total of 7.4 additional votes. To determine the overall
gain we sum the gains produced by each expenditure; thus a budget
of $100 produces 115 votes, $200 produces 205 votes, $326 produces
306 votes, and so on.


                Blind            Selective
             G_1     C_1       G_2     C_2      G_1/C_1        G_2/C_2       (G_2-G_1)/(C_2-C_1)
Precinct   (Votes)   ($)     (Votes)   ($)       (V/$)          (V/$)              (V/$)
   (1)       (2)     (3)       (4)     (5)        (6)            (7)                (8)

    1         58     100        65     162     .58 = a_11        .40            .12 = a_12
    2        115     100       130     162    1.15 = a_21        .80            .23 = a_22
    3         45     100        50     126     .45 = a_31        .40            .22 = a_32
    4         90     100       101     126     .90 = a_41        .80            .43 = a_42
    5         29     100        50     126     .29            .40 = a_51            -
    6         58     100       101     126     .58            .80 = a_61            -
    7         22     100        39      98     .22            .40 = a_71            -
    8         45     100        78      98     .45            .80 = a_81            -
    9        -15     100        29      72    -.15            .40 = a_91            -
   10        -30     100        58      72    -.30            .80 = a_10,1          -
   11        -11     100        22      56    -.11            .40 = a_11,1          -
   12        -22     100        45      56    -.22            .80 = a_12,1          -

Table 4.2
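
The slopes in the table, and the cumulative gains quoted in the text, can be recomputed from Table 4.1 with a short script. This is a sketch under stated assumptions: the variable and function names are hypothetical, the activation parameter is taken as 0.4, the contact costs as ten and twenty cents, and the label "switch" stands for the transition from blind to selective saturation. Results are rounded to whole dollars and votes for comparison with the text.

```python
ALPHA = 0.4             # fraction of nonvoters activated by a contact
C1, C2 = 0.10, 0.20     # dollars per blind / selective contact

# Table 4.1, one row per precinct: (precinct, P_A, P_R, P_V, N)
TABLE_4_1 = [
    (1, 0.9, 0.9, 0.8, 1000), (2, 0.9, 0.9, 0.6, 1000),
    (3, 0.9, 0.7, 0.8, 1000), (4, 0.9, 0.7, 0.6, 1000),
    (5, 0.7, 0.9, 0.8, 1000), (6, 0.7, 0.9, 0.6, 1000),
    (7, 0.7, 0.7, 0.8, 1000), (8, 0.7, 0.7, 0.6, 1000),
    (9, 0.4, 0.9, 0.8, 1000), (10, 0.4, 0.9, 0.6, 1000),
    (11, 0.4, 0.7, 0.8, 1000), (12, 0.4, 0.7, 0.6, 1000),
]


def precinct_segments(precinct, p_a, p_r, p_v, n):
    """Return the (label, slope, capacity) segments of the concave approximation."""
    blind_slope = ALPHA * (1 - p_v) * (2 * p_a - 1) * p_r / C1   # p1 / c1
    selective_slope = ALPHA * (1 - p_v) / C2                     # p2 / c2
    cost1 = C1 * n                  # cost of blind saturation
    cost2 = C2 * p_a * p_r * n      # cost of selective saturation
    if blind_slope > selective_slope:
        gain1 = blind_slope * cost1
        gain2 = selective_slope * cost2
        switch_slope = (gain2 - gain1) / (cost2 - cost1)
        return [((precinct, "blind"), blind_slope, cost1),
                ((precinct, "switch"), switch_slope, cost2 - cost1)]
    return [((precinct, "selective"), selective_slope, cost2)]


segments = [s for row in TABLE_4_1 for s in precinct_segments(*row)]
segments.sort(key=lambda s: s[1], reverse=True)   # stable: ties keep table order

spent = gained = 0.0
cumulative = []
for label, slope, capacity in segments:
    spent += capacity
    gained += slope * capacity
    cumulative.append((label, round(spent), round(gained)))

for row in cumulative[:3]:
    print(row)
```

The first three rows correspond to the budgets of $100, $200 and $326 discussed above; the last entry of `cumulative` corresponds to the switch from blind to selective saturation in precinct 1.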

In the blind-optimal type of canvass, we optimize only with
respect to the blind-canvass productivities in column (6); thus
we invest in precincts 2 (115 votes), 4 (205 votes), 6,
etc. The final allocation, at an expenditure level of $800, is to
precinct 7; even if the budget is larger we never allocate funds
to the remaining precincts 9 through 12, since the results would be
counterproductive. A selective-optimal canvass is handled similarly,
using the selective-canvass productivities in column (7).

To obtain the per-dollar productivities for blind-random and
selective-random canvasses we average the productivities in
columns (6) and (7); the results are .32 votes per dollar (for
budgets ≤ $1200) and .60 votes per dollar (for budgets ≤ $1280)
respectively.
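The two averages can be checked directly from columns (6) and (7) of Table 4.2. The short sketch below is only a verification aid; the variable names are illustrative and not part of the original analysis.

```python
from statistics import mean

# Column (6) of Table 4.2: blind-canvass votes per dollar, precincts 1-12.
blind_per_dollar = [0.58, 1.15, 0.45, 0.90, 0.29, 0.58,
                    0.22, 0.45, -0.15, -0.30, -0.11, -0.22]
# Column (7): selective-canvass votes per dollar alternates .40 and .80.
selective_per_dollar = [0.40, 0.80] * 6

blind_random = round(mean(blind_per_dollar), 2)
selective_random = round(mean(selective_per_dollar), 2)
print(blind_random, selective_random)

# Expected gains of a $600 random canvass of each kind.
print(round(600 * blind_random), round(600 * selective_random))
```

At a $600 budget these averages give 192 and 360 votes, the figures that appear for the B-R and S-R modes in the later tables.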

To get a general picture of how these modes of operation compare
in efficiency we have plotted in Figure 4.2 the expected
plurality gain produced by each, for budgets up to $1280. 10


Figure 4.2 
Conversely, the blind-random canvass is uniformly the least efficient.

If for some reason only the blind tactic is possible, then clearly
optimization (the B-O mode) is worthwhile. Except at very large
budgets the same is also true of the selective tactic; as the budget
approaches $1,280, however, the S-O and S-R modes (and the F-O
mode as well) become identical, since in all cases the recommended
field activity (a selective canvass of the entire constituency)
is the same. At small budget levels, blind optimization
is superior to either type of selective canvass, and it is better than
selective-random canvassing even at fairly large levels. However,
this is in part an artifact of our example; in constituencies not so
overwhelmingly pro-A in partisanship, B-O canvassing could be
inferior to S-O or even S-R canvassing at every budget level. In
such constituencies the relative advantage of F-O over S-O would
also decrease, since use of the blind tactic (which is the only
difference between F-O and S-O) would be less attractive. The
margin between S-O and S-R would remain, since it depends basically
on the heterogeneity of the constituency, rather than its
partisanship; and the advantage of B-O over B-R would grow, since in a
balanced constituency blind-random canvassing would be
unproductive or even counterproductive.

5.1 The analysis of the preceding sections has been based on a
specific model of canvassing. Like any model, ours is a drastically
simplified representation of a very complex and uncertain process.
In empirical research of this kind, the investigator is always faced
with a fundamental choice: whether to adopt a simple and therefore
useful model, at the risk of being too simple and hence
unrealistic; or whether, on the other hand, to employ a more
complicated but more realistic model, which may prove to be too
complex to be of much practical use. In the present study we chose
a simple model, and as a result the analysis has been relatively
straightforward. Before accepting it, however, it is important to
consider the extent to which our conclusions are sensitive to
possible violations of the assumptions on which the analysis is based.
For example, we have assumed that certain information is
available for planning purposes, and that this information is perfectly
reliable; but what if it is not available, or not reliable? Also, a
peculiarity of the campaign problem (which it shares with many
military problems) is that it is a game situation, in which
whatever strategy we select will be confronted by our opponent's
counterstrategy. What happens to our analysis if, for example, our
opponent conducts a canvass of his own? Let us briefly consider
some of these possibilities.

5.2 We have so far assumed that we are able to make accurate
forecasts of turnout levels, of the party’s share of the vote in each
precinct, and of the partisanship of individual voters. On the basis
of these forecasts we can predict the effects of alternative
allocations of our canvassing resources, and can therefore identify that
allocation which is best. How sensitive are our conclusions to
inaccuracies in these forecasts?

First consider the turnout predictions. Clearly certain
combinations of errors could throw our calculations seriously awry; for
example, if errors were concentrated in those precincts for which
one tactic was best, then the effects and the relative efficiencies
of the various modes of operation could be greatly altered. Such
malevolent errors are always possible, but other types of error are
more likely and are therefore of more interest. Forecast errors
might be randomly and independently distributed across all
precincts; provided they are not too large, the analysis is not
substantially affected. A more important type of turnout-forecast error
is where all precincts are affected comparably; for example, good
or bad weather might cause all turnout rates to be generally above
or below the forecasts. Suppose that the actual nonvoting rates
\(1 - P_V'\) are \(\beta\) times the predicted rates (where \(\beta > 1\) for bad
weather, \(\beta < 1\) for good weather). By inspection of (3.4) and
(3.5), it is clear that using either tactic will yield \(\beta\) times the
predicted gain, and thus their relative (but not absolute)
efficiencies are unaffected.

Now consider the partisanship forecasts; let us suppose that our
information is unreliable, in the sense that only a certain fraction
\(\beta < 1\) of the voters actually do vote as predicted, the remainder
voting for the opposition. If \(P_A'\) is the forecast value, then the
actual value of \(P_A\), taking account of defection from and to party
A, is evidently

\[ P_A = \beta P_A' + (1-\beta)(1-P_A') = P_A'(2\beta-1) + (1-\beta) \]

The blind-canvass productivity (3.3) contains a \((2P_A - 1)\) term
which becomes, in terms of the forecast value,

\[ 2[P_A'(2\beta-1) + 1 - \beta] - 1 = 2P_A'(2\beta-1) + 2 - 2\beta - 1 = (2P_A'-1)(2\beta-1) \]








Hence the true blind-canvass productivity will be \((2\beta - 1)\) times
the forecast value. In a selective canvass the defection from A
causes it in effect to become a blind canvass with productivity

\[ \alpha(1-P_V)(2\beta-1), \qquad (5.1) \]

which again is \((2\beta - 1)\) times the forecast productivity (3.5).
Thus both tactics are again affected identically, and their relative
efficiencies are unaffected.
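The algebra behind the \((2\beta - 1)\) factor can be spot-checked with exact rational arithmetic. The sketch below is a verification aid only; the function names and the sample forecast values are illustrative assumptions.

```python
from fractions import Fraction


def actual_share(forecast_share, beta):
    """P_A = beta*P_A' + (1 - beta)*(1 - P_A'): two-way defection."""
    return beta * forecast_share + (1 - beta) * (1 - forecast_share)


def blind_term(share):
    """The (2*P_A - 1) factor of the blind-canvass productivity."""
    return 2 * share - 1


beta = Fraction(8, 10)
for forecast in (Fraction(1, 2), Fraction(2, 3), Fraction(9, 10)):
    lhs = blind_term(actual_share(forecast, beta))   # true value
    rhs = blind_term(forecast) * (2 * beta - 1)      # forecast times (2*beta - 1)
    print(forecast, lhs, rhs, lhs == rhs)
```

Each line prints matching left- and right-hand sides, as the derivation above requires.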

A more pessimistic assumption (for A) is to suppose that
defection takes place exclusively from A to B, and not in the
opposite direction. If only \(\beta\) of the forecast A-voters actually do vote
for A, then the selective canvass productivity is again as in (5.1);
the blind-canvass productivity, on the other hand, becomes

\[ (2\beta P_A' - 1)(1-P_V)P_R. \]

Thus the tactics are affected differently. To see what this means
for our various canvassing modes, let us consider $600 canvasses
of each type; the forecast and actual effects of each type of
canvass, conducted in the constituency described earlier, are
tabulated in Table 5.1 for \(\beta = .8\).


Type of          % Defection:
Canvass          0%            20%

F-O              513 votes     286 votes
S-O              480           288
B-O              410           200
S-R              360           216
B-R              192           62

Table 5.1
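The selective-mode rows of Table 5.1 follow directly from the factor in (5.1). A minimal sketch, with a hypothetical function name, that reproduces them from the forecast figures:

```python
def selective_actual(forecast_votes, beta):
    """Scale a forecast selective-canvass gain by (2*beta - 1), as in (5.1)."""
    return forecast_votes * (2 * beta - 1)


forecast = {"S-O": 480, "S-R": 360}   # 0% defection column of Table 5.1
actual = {mode: round(selective_actual(votes, 0.8))
          for mode, votes in forecast.items()}
print(actual)
```

The blind modes cannot be checked the same way from the table alone, since their factor \((2\beta P_A' - 1)/(2P_A' - 1)\) depends on the precinct-level forecasts.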


All modes are adversely affected by the defection, but the blind
canvasses B-O, B-R are most seriously hurt. (Conversely, had
defection from B to A occurred, these modes would have taken greater
advantage of the fact.) Although the selective tactic requires
more detailed information than does the blind-canvassing tactic,
nevertheless the modes which use this tactic (S-O, S-R) are not
more sensitive, and in Table 5.1 are less sensitive, to various kinds
of error in the information. However, these modes are affected
seriously if the required information is simply unavailable. If only
some of the supporters of A are individually known to the party,
then the various modes will be affected as shown in Table 5.2
(again for budgets of $600).
Type of          % of Supporters Known:
Canvass          100%          75%           50%

S-O              480 votes     432 votes     368 votes
B-O              410           410           410
S-R              360           360           360
B-R              192           192           192

Table 5.2


For the range of contingencies considered in the table, only
the S-O mode is affected; when fewer than 66% of the A-voters are
known, the B-O mode is better. If only 25% were known, then the
S-R mode could produce no more than 192 votes, the same as the
B-R gain; however, the cost would be much less, since a budget
of $320 would then suffice to contact all known A-supporters. Full
optimization, though not tabulated in the table, would be superior
throughout, since the algorithm (with minor modification) takes
account of information constraints by making more use of the
blind tactic where necessary; in the limiting case where no
supporters are known, full optimization becomes identical to blind
optimization.
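The 66% threshold and the $320 figure can both be approximated from numbers already given. The sketch below assumes, purely for illustration, that the S-O gain falls linearly between the 75% and 50% columns of Table 5.2; the source does not state how the threshold was obtained.

```python
def crossover(x_hi, y_hi, x_lo, y_lo, target):
    """Linear interpolation: the x at which y equals target."""
    return x_lo + (target - y_lo) * (x_hi - x_lo) / (y_hi - y_lo)


# S-O yields 432 votes at 75% known and 368 at 50%; B-O yields 410 throughout.
threshold = crossover(75, 432, 50, 368, 410)
print(round(threshold, 1))      # percent of A-voters known where S-O equals B-O

# With 25% of supporters known, a full selective canvass of them costs a
# quarter of the $1280 needed for the whole constituency, and at the
# selective-random rate of .60 votes per dollar it yields:
budget = 0.25 * 1280
print(budget, round(budget * 0.60))
```

Under the linearity assumption the crossover falls near 66%, and the second calculation gives a $320 budget producing 192 votes.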

Another assumption which has been implicit throughout is that
the opposition does not conduct a canvass of his own. Let us
very briefly consider the consequences of relaxing this assumption.
First suppose that the opposition contacts \(\beta\) of all voters, at
random. Then evidently \(\beta\) of A’s contacts are in effect wasted,
since those voters have already been (or will be) contacted, and
by our assumptions a second contact has no additional effect. Thus
all tactics and all modes are affected similarly, and the relative
efficiencies are unchanged.

The effects of an opposition selective canvass (at random) are
more interesting. Such opposition activity will not affect a
selective canvass by party A, since both parties contact only their own
known supporters. If A conducts a blind canvass, then none of
the contacts with A-supporters are wasted; on the other hand \(\beta\)
of the contacts with B-voters are. Thus the blind canvass inspires
fewer additional B-voters to vote, and therefore, oddly, becomes
more productive; the actual per-contact productivity is

\[ \alpha[P_A - (1-\beta)(1-P_A)](1-P_V)P_R \]

When \(\beta\) approaches unity (i.e., when the opposition contacts all
the B-voters), then a blind A-canvass acts almost like a selective
canvass, except that some contacts are still wasted on unregistered
voters. Table 5.3 shows the effects of opposition selective
canvassing upon the different kinds of $600 A-canvasses.

Type of          % of B-voters contacted by opposition:
Canvass          0%            50%           100%

F-O              513 votes     528 votes     545 votes
S-O              480           480           480
B-O              410           467           529
S-R              360           360           360
B-R              192           288           384

Table 5.3


Even when the opposition canvasses half of its supporters the
efficiency ranking is unchanged; however, with a 100% canvass,
B-O surpasses S-O and B-R is better than S-R. Full optimization
remains the most efficient mode throughout.
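The B-R row of Table 5.3 illustrates how the bracketed factor \(P_A - (1-\beta)(1-P_A)\) grows with \(\beta\). In the sketch below the value \(P_A = 2/3\) is an assumption chosen for illustration because it reproduces that row; the source does not state a constituency-wide \(P_A\) at this point, and the function name is hypothetical.

```python
from fractions import Fraction


def blind_factor(p_a, beta):
    """P_A - (1 - beta)*(1 - P_A); equals 2*P_A - 1 when beta = 0."""
    return p_a - (1 - beta) * (1 - p_a)


P_A = Fraction(2, 3)   # illustrative assumption, not given in the source
baseline = 192         # B-R gain with no opposition canvass (Table 5.3)

gains = [baseline * blind_factor(P_A, beta) / blind_factor(P_A, 0)
         for beta in (Fraction(0), Fraction(1, 2), Fraction(1))]
print([int(g) for g in gains])
```

With this assumed \(P_A\) the blind-random gain rises from 192 to 288 to 384 votes as the opposition contacts 0%, 50%, and 100% of its own voters.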

6.1 In the preceding three sections, we have suggested a simple
and general method for planning a political canvass, which seems
to offer advantages over various simpler approaches to the
problem, and whose superiority in this respect does not seem to be
highly sensitive to the specific assumptions on which our analysis
was based. Whatever the practical relevance of these findings, the
same method should be equally applicable to the very similar
problems of telephone and mail canvassing. The same general
approach, though with differences in detail, could be used to
analyze the problems of planning a precampaign, partisan
registration drive.

Clearly there are other campaign activities—television activities, 
for example—which are of a wholly different order of complexity. 
Even there, however, there is reason to hope that systematic 
quantitative analysis may become feasible in the not too distant 
future; operational research on marketing problems, for example, 
may lead to results of direct relevance to television campaigning. 11 
In any event, the use of quantitative methods for policy analysis 
has proved to be fruitful in many different fields, and these 
methods deserve to be more widely known, and used, in political 
science. 


11 See, for example, J. D. Herniter and R. A. Howard, "Stochastic Marketing
Models,” chap. iii of Hertz and Eddison, op. cit., pp. 33 ff.
4. Models of the Political System

Introductory Note 

We are entering upon an age of reconstruction, in religion, in 
science, and in political thought. Such ages, if they are to avoid 
mere ignorant oscillation between extremes, must seek truth in its 
ultimate depths. There can be no vision of this depth of truth apart 
from a philosophy which takes full account of those ultimate
abstractions, whose interconnections it is the business of mathematics
to explore.

Alfred North Whitehead 

Although Plato believed in mathematics as the key to ultimate
philosophical truth, he used verbal means to express his models
of the political system. The mode of expression remained verbal
for twenty-three hundred years, and only recently have scholars
begun to convert to mathematical expression. The process of
conversion has contributed a new rigor to political models. When the
scholar attempts to translate his ideas into the language of
mathematics, verbal ambiguities are discovered and must be eliminated.
Vague ideas must be clarified and reduced to precision, if they are
to be expressed in mathematical symbols. When this occurs, the
scholar better understands his subject. 1

Whether verbal or mathematical, model-building requires
simplifying assumptions because all variables cannot be identified
or controlled. This necessity of simplification is found in models of
the physical, as well as the social, sciences. Any doubt that this
is true is quickly dispelled when one contemplates what has
happened to the Newtonian models in twentieth-century physics. In
physics, as in other realms, mathematical models are merely
abstractions, designed to approximate the real world. Despite their
imperfections they have been useful to technology, as well as to
science. Social scientists have the additional problem of the human
psyche, which gives rise to a considerable variety of behaviors.
Political scientists have, however, a source of comfort. The psychic
problem has not caused psychiatrists and sociologists to despair,
although their disciplines are, in some ways, less amenable to
precise conceptualization than is political science.

1 Otto A. Davis, "Final Critique of the Conference on Mathematical Application in 
Political Science,” Southern Methodist University, Dallas, August 6, 1965. 
Simplifying assumptions have been well-recognized limitations
on scientific model-building. In his perceptive coupling of some
modern political theories with some classical theories, William T.
Bluhm describes Anthony Downs 2 and William H. Riker as
“strategy theorists,” and he compares their method with that of
Hobbes. He quotes Chapter VII of The Leviathan:

No discourse whatsoever can end in absolute knowledge of fact, past 
or to come. For as for the knowledge of fact, it is originally sense, and 
ever after memory. And for the knowledge of consequence, which I have 
said before is called science, it is not absolute but conditional. No man 
can know by discourse that this or that is, has been, or will be, which is 
to know absolutely, but only that if this be, that is; if this has been, that 
has been; if this shall be, that shall be—which is to know conditionally, 
and that not the consequence of one thing to another, but of one name 
of a thing to another name of the same thing. 

Bluhm observes that even though the [conditional] knowledge 
we have always remains knowledge of an abstract world, not a 
real one, 

... if we are good at fitting the right “general names” to the
particular “fancies” that inhabit our psyches, and if the rules we establish
correspond to empirical laws, our scientific knowledge provides us with
a powerful instrument of prediction and control over the world of
sensible particulars. We can interpret the real world in the light of the
model, and thus establish power over it. . . .

[Hobbes asserts] that the theoretical reason is not a device for
understanding and contemplating eternal objects, but an instrument for
manipulating the world of sense, because the world of sense has a
logical structure to it, susceptible of being known under the categories of a
model world of “general names.” 3

Riker's article on the size principle is a postscript to his
important volume, The Theory of Political Coalitions, in which the
size principle is the central idea. Riker has undertaken no less than
the “creation of a theoretical construct that is a somewhat
simplified version of what the real world [of political coalitions] ... is
believed to be like.” He suggests that propositions from his model
can be validated or refuted empirically, and the model in
consequence can be perfected or abandoned.


2 An Economic Theory of Democracy (New York: Harper & Row, 1957).

3 Theories of the Political System (Englewood Cliffs, N.J.: Prentice-Hall, 1965),
pp. 267-270.
Game theory provides the conceptual basis for the model. The
coalition theory includes these notions: an “n-person” game [...]
(gains precisely equal losses) [...] of action [...] these [...]
participants [...] “side payments” are permitted, [...] concludes
that [...] coalitions toward the absolute minimum size
necessary for success. He also posits that a long-range result of
[...] system, which includes these characteristics, is the
elimination of participants. Consequently disequilibrium rather
than a “balance of power” occurs.

Riker’s work has been criticized on the ground that its
simplifying assumptions, particularly the zero-sum assumption, make it
inapplicable to real world politics. The author, however,
anticipates this stricture with a variety of historical evidence to buttress
[...]

[...] discussing elections and wars, which are perceived [...]
indivisible victory, the zero-sum model is probably best. . . . 5

The article by Otto A. Davis and Melvin Hinich belongs to the
same genre as the works of Riker and of Anthony Downs. There [...]

4 The Theory of Political Coalitions (New Haven: Yale University
Press, 1962) [...]

5 Ibid., p. 31.
supports the candidate whose position on the [...] likely to
maximize his (the voter's) utility, and [...] will adopt the
policies announced prior to election. The model assumes that
policies are measured by [...] and that voters use the same
indices. Davis and Hinich demonstrate that given these
assumptions, including the known distribution of voters on a
continuum, a median strategy normally wins over [...]

The authors next consider the problem of a party nomination of
a winning candidate. If a purely democratic nominating process
is employed and if all voters in the system are members of a party
in a two-party system, disequilibrium and resort to violence may
occur, when a minority whose desires differ widely from the views
of the majority are denied any chance of influencing [...]
If the policy position of a minority party candidate is preferred
by enough members of the majority party to constitute a slight
majority of all voters, the candidate of the minority party may
win. The model, in its application to the nomination problem,
suggests that chances of victory in the general election may
be improved by selection of a candidate whose position is a
compromise between the desires of his own party members and the
members of the other party.

The analysis reveals a dilemma of nominations. The democratic
method of choosing a nominee permits rational party voters to
seek maximization of their utilities by choosing a candidate whose
position is harmonious with their own. The dictates of general
election strategy, on the other hand, require a compromise
candidate who can appeal to some members of the opposition party.
This kind of nomination may be achieved by abandoning
democracy in favor of a “smoke-filled room” choice. The latter
enables the party to choose a candidate whose position is more
compatible with the entire population of voters (in both parties),
although it may be less preferred by the subset of voters in the
candidate's own party.

The reader will note that the Davis-Hinich model, although it
analyzes “conditional” knowledge, describes and explains
mathematically a number of observable uniformities which are found in
party systems of the real world.



A New Proof of the Size Principle 

WILLIAM H. RIKER

University of Rochester 

In The Theory of Political Coalitions I presented a proof of the
size principle, which is an adaption to the world of real coalitions
of the following inference from the theory of n-person games:

In n-person, zero-sum games, where side payments are permitted, where
players are rational, and where they have perfect information, only
minimum winning coalitions occur.

The proof of this inference was, however, somewhat involved, so 
I take the opportunity of this paper to present a simpler and more 
easily understandable direct proof. 


As a preliminary step, let me recall for the reader some of the
main notions of n-person game theory as set forth by Von
Neumann and Morgenstern (2).

In two-person, zero-sum games, the problem faced by each
player is the selection of a strategy (i.e., a complete set of choices
for each possible move) such that the player receives an amount,
v, which is the most he can unilaterally guarantee himself and
the least his opponent can unilaterally hold him down to. In
n-person games, however, the problem faced by each player, at
least in all games where any kind of co-operation is permitted, is
less a selection of strategy and more the selection of partners.
Presumably two persons co-operating can sometimes accomplish more
than both can acting individually. Hence the main action in
n-person games is the formation of coalitions. Even though in the
n-person case the problem of play is different from the problem
of play in the two-person case, it is still possible to retain the
notion of a value, v, which is the most that can be unilaterally
guaranteed. Suppose a coalition, S, forms. Then the worst thing
that can happen to it is that its complement, \(-S\), forms. (That
is, its complement, \(-S\), can presumably give S more effective
opposition than can smaller coalitions, P, Q, and R, where
\(P \cup Q \cup R = -S\).) If \(-S\) forms, we have something like a
two-person game between S and \(-S\) and hence can speak of a value
for S, v(S), which is called a characteristic function, and which
is the amount S can guarantee itself regardless of what \(-S\) does
and also the amount \(-S\) can hold S down to. The characteristic
function is a real valued set function with the following
properties:

(1) \(v(\emptyset) = 0\), where \(\emptyset\) is the empty set. (Presumably an empty
    coalition is valueless.)

(2) \(v(S) = -v(-S)\), which is the zero-sum condition.

(3) \(v(I_n) = 0\), where \(I_n\) is the identity subset of the set, N, of
    players, that is, a coalition of the whole. (This property
    is an inference from (1) and (2).)

(4) \(v(S \cup T) \ge v(S) + v(T)\), where S and T are disjoint subsets
    of N. When only the equality relation holds, the game is
    said to be inessential (for there is no point to making
    coalitions). Otherwise, the game is essential. In the
    subsequent discussion we will be concerned only with
    essential games.

An example of a characteristic function is:

If S has      0      1      2      3      4      5     members,
v(S) =        0    -20    -40     40     20      0

In order to render characteristic functions in a form that allows
easy comparison among games, it is customary to normalize them
by letting the coalition of a single player be worth a given
minimum, say, \(-\gamma\). That is,

(5) \(v([i]) = -\gamma\), for i = 1, 2, . . . , n.


Setting \(-\gamma = -1\), we have the following normalized form for the
foregoing example:

If S has      0      1      2      3      4      5     members,
v(S) =        0     -1     -2      2      1      0
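Properties (2) and (4) and the normalization step can be checked mechanically for this example. The sketch below is an illustration under an assumption that holds for the example but not for games in general: v(S) depends only on the number of members of S, so the function is stored as a list indexed by coalition size. The function names are hypothetical.

```python
from fractions import Fraction

# v[p] is the value of any coalition with p members (five-player example).
v = [0, -20, -40, 40, 20, 0]


def is_zero_sum(v):
    """Property (2): v(S) = -v(-S); a p-coalition faces an (n - p)-coalition."""
    n = len(v) - 1
    return all(v[p] == -v[n - p] for p in range(n + 1))


def is_superadditive(v):
    """Property (4): v(S u T) >= v(S) + v(T) for disjoint S and T."""
    n = len(v) - 1
    return all(v[p + q] >= v[p] + v[q]
               for p in range(n + 1) for q in range(n + 1 - p))


def is_essential(v):
    """Essential if some pair of nonempty disjoint coalitions gains strictly."""
    n = len(v) - 1
    return any(v[p + q] > v[p] + v[q]
               for p in range(1, n) for q in range(1, n + 1 - p))


def normalize(v, gamma=1):
    """Rescale so that a single player is worth -gamma."""
    return [Fraction(x, -v[1]) * gamma for x in v]


print(is_zero_sum(v), is_superadditive(v), is_essential(v))
print([int(x) for x in normalize(v)])
```

The example passes all three checks, and rescaling by the single-player value gives 0, -1, -2, 2, 1, 0 for coalitions of 0 through 5 members.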



Characteristic functions do not, however, completely describe
an n-person game. What counts for the individual player is not
just the value of the coalition, however much that may be, but
rather what portion of the value he personally receives. It is
conceivable that player i, whose individual receipts are denoted by
the symbol “\(a_i\),” may prefer a coalition \(S_2\) to a coalition \(S_1\), where
\(v(S_1) > v(S_2)\), if \(a_i\) for \(i \in S_2\) exceeds \(a_i\) for \(i \in S_1\). One must,
therefore, describe not only the payoff to coalitions but also the
payoff to individuals, which latter are called imputations: An
imputation is an n-tuple of real numbers, \(a = (a_1, a_2, \ldots, a_n)\),
which satisfies the following conditions:

(6) \(a_i \ge v([i])\), which asserts that no player will accept in any
    coalition an amount less than he can obtain in a coalition
    of himself alone; and

(7) \(\sum_{i=1}^{n} a_i = 0\), which is not only the zero-sum condition, but also
    asserts that rational players, whatever their coalition
    structure, will obtain the full value of the game.
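Conditions (6) and (7) translate directly into a small predicate. The sketch below is illustrative; the function name and the sample payoff vectors are assumptions, with every single-player value set to -1 as in the normalized example.

```python
def is_imputation(a, v_single):
    """Check conditions (6) and (7) for a payoff vector a.

    a        -- payoffs (a_1, ..., a_n)
    v_single -- v([i]) for each player i, in the same order
    """
    individually_rational = all(ai >= vi for ai, vi in zip(a, v_single))  # (6)
    zero_sum = sum(a) == 0                                                # (7)
    return individually_rational and zero_sum


v_single = [-1] * 5
print(is_imputation((-1, -1, 0, 1, 1), v_single))   # satisfies (6) and (7)
print(is_imputation((-2, 0, 0, 1, 1), v_single))    # player 1 falls below v([1])
print(is_imputation((0, 0, 0, 1, 1), v_single))     # payoffs do not sum to zero
```

Only the first vector qualifies: the second violates (6) and the third violates (7).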


II 

The task of n-person theory is to place some limitations on both
characteristic functions and imputations in order to render the
outcomes predictable. Von Neumann and Morgenstern initiated this
process with a discussion of the range of characteristic functions.
Specifically, they showed:

(8) if S has 0 members, \(v(S) = 0\). (from (1))

(9) if S has 1 member, \(v(S) = -\gamma\). (from (5))

(10) if S has (n - 1) members, \(v(S) = \gamma\). (from (2) and (5))

(11) if S has n members, \(v(S) = 0\). (from (3))

(12) if S has p members, where \(2 \le p \le (n - 2)\),
     then \(-p\gamma \le v(S) \le (n - p)\gamma\). (from (6))

Graphically these results can be shown thus:

Figure 1
where points (0, 0), \((1, -\gamma)\), \(((n - 1), \gamma)\), and (n, 0) represent
assertions (8), (9), (10), and (11) respectively and where the
vertical lines represent assertion (12). Since Von Neumann and
Morgenstern did not wish to use the notion of a majority (because
they wished to allow for weighted players who, though fewer than
a numerical majority, might win and because they wished to
allow for discriminatory solutions in which some players were
guaranteed minimum gains and losses), they could not narrow the
range further. One can, however, use the notions either of a
majority of equally weighted persons or of a majority of equal units
of weight, thereby preserving the feature of weights while
permitting much further narrowing of the range of characteristic
functions. (Here we will be concerned only with majorities of
equally weighted persons; but for a presentation of the majority
notion in terms of units of weight, see reference (1), pp. 253-61.)
In so doing, we are, of course, limiting ourselves to
nondiscriminatory solutions for, if the notion of a majority is used,
discrimination can appear only as unequal weighting.

Let m be the minimal value of a majority, where

(13) \(\frac{n+1}{2}\) or \(\frac{n+2}{2} \le m < n\). (Note that the right
     inequation is written “\(m < n\)” rather than “\(m \le n\)” as is
     often customary. If \(m = n\), there is nothing to do in the
     game except form the single coalition, \(I_n\), of all players,
     which fact renders characteristic function theory trivial.)

The following definitions can now be offered:

(14) if \(p > n - p\) and \(p \ge m\), then \(S_p \in W\), where W is the set of all
     winning coalitions; if \(S_p \in W\), then \(v(S) \ge 0\); if \(p \ne n\), then,
     for \(S \in W\), \(v(S) > 0\), which follows from (4) since we have
     assumed the game is essential.

(15) if \(p = m\), then \(S_p \in W^m\), where \(W^m\) is the set of minimal
     winning coalitions such that \(S_{p-1} \notin W^m\).

(16) if \((n - p) \le p < m\), \(S_p \in B\), where B is the set of blocking
     coalitions; and \(v(S) = 0\).

(17) if \(S \notin W\) and \(S \notin B\), then \(S \in L\), where L is the set of losing
     coalitions; \(v(S) \le 0\); if \(p > 0\), then \(v(S) < 0\), which follows
     from (2) and (14).

With these definitions it is possible to rewrite (12), narrowing
the range of v(S):

(18) if \(S \in W\), then \(0 \le v(S) \le (n - p)\gamma\). (from (12) and (14))

(19) if \(S \in B\), then \(v(S) = 0\). (from (16))

(20) if \(S \in L\), then \(-p\gamma < v(S) < 0\). (from (12) and (17))

Ignoring the possibility of blocking coalitions, the results can be 
shown graphically thus: 

Figure 2 
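Definitions (14), (16), and (17), together with the ranges (18)-(20), can be summarized as a classification by coalition size. The sketch below applies the definitions literally to equally weighted players; the function names are hypothetical, and the returned pair gives only the endpoints of the stated range for v(S).

```python
def classify(p, n, m):
    """Classify a coalition of p members among n players with majority size m."""
    if p > n - p and p >= m:
        return "W"          # winning, definition (14)
    if n - p <= p < m:
        return "B"          # blocking, definition (16)
    return "L"              # losing, definition (17)


def value_range(p, n, m, gamma=1):
    """Endpoints of the range of v(S) given by (18), (19), and (20)."""
    kind = classify(p, n, m)
    if kind == "W":
        return (0, (n - p) * gamma)
    if kind == "B":
        return (0, 0)
    return (-p * gamma, 0)


n, m = 5, 3
for p in range(n + 1):
    print(p, classify(p, n, m), value_range(p, n, m))
```

With five players and m = 3 there are no blocking coalitions: sizes 0 through 2 are losing and sizes 3 through 5 are winning. With six players and m = 4, a coalition of three is blocking.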



Even though the range of characteristic functions has thus been
narrowed largely by eliminating discriminatory solutions, we still
know relatively little about what coalitions might occur and about
what imputations might be associated with them. In this
section, I shall set forth another kind of restriction on coalitions which
permits a prediction about the range of occurrences and which is
sometimes useful in political analysis.

We can assume, of course, that, since the game is essential, some
S, \(S \in W\), and some \(-S\), \(-S \in L\), occur (ignoring here the
possibility of blocking coalitions). If, however, there exists some \(S_q\),
\(S_q \in W\), and some imputation, a, associated with \(S_q\), such that \(S_q\)
can guarantee its members more than they might receive in a
smaller coalition and at least as much as they might receive in a
larger one, then they would prefer coalitions of size q to all others.
Such coalitions, \(S_q\), are realizable, while all others \(S_p\), where \(p \ne q\),
are unrealizable. Presumably, once a coalition reaches a realizable
size, it is relatively stable, except of course for internal squabbles
over the division of v(S) into \(a_i\), \(i \in S\).

The intuitive idea in the notion of realizable coalitions is that,
in the set W, there is a subset of realizable coalitions, \(W^q\), such
that any coalition in \(W^q\) is preferred to any coalition not in it
because in S, \(S \in W^q\), the amounts that S can unilaterally (that is,
without the co-operation of \(-S\)) guarantee its members
individually are at a maximum and, for that maximum, the costs of
organization are minimal.

Stating this notion formally: For \(S_p\), \(S_q\) and \(S_r\), where p < q
< r, \(S_q\) is realizable if, for \(S_q \in W\), it is possible that

(I) \(a_i^{S_p} < a_i^{S_q}\); and

(II) \(a_i^{S_q} \ge a_i^{S_r}\),

where the notation “\(a_i^{S_x}\)” means “the payment to i when i is a
member of \(S_x\)” and x = {p, q, r}.

The theorem to be proved is: \(W^q = W^m\). That is, only minimal
winning coalitions are realizable. In the proof, I shall show,
first, that S, \(S \in W^m\), fulfill condition (I), second, that they fulfill
condition (II), and third, that they alone of all S, \(S \in W\), fulfill
both conditions.

First, let there be two sets \(S_p\) and \(S_q\), where p < m, q = m,
and \(S_p\) is a proper subset of \(S_q\). Here it is always true, by reason
of (14), (17), (18), and (20), that \(v(S_p) < v(S_q)\), for \(v(S_p)\) is a
negative number or zero while, when q = m, \(v(S_q)\) is a positive one.
Hence the amount \([v(S_q) - v(S_p)]\) can always be divided
among the i, \(i \in S_p\) and \(S_q\), and the j, \(j \in S_q\), \(j \notin S_p\), in such a way
as to guarantee that \(a_j^{S_q} > 0\) and \(a_i^{S_p} < a_i^{S_q}\). (That is, \(S_p\), by turning
itself into a minimal winning coalition, \(S_q\), can increase its
value sufficiently to pay all its old members more than they
receive in \(S_p\) and to pay its new members something for
joining.) Hence S, \(S \in W^m\), satisfies Condition I of being realizable.

Second, let there be two sets \(S_q\) and \(S_r\), where m = q < r, and
where \(S_q\) is a proper subset of \(S_r\). Here it is possible that
\(v(S_q) \gtreqless v(S_r)\). So it is necessary to prove that S, \(S \in W^m\), meets
condition (II) in each of three cases:

Case 1. \(v(S_q) > v(S_r)\). Since \(v(S_q) > v(S_r)\), it is always
possible to form \(S_q\) in such a way that, for \(i \in S_q\) and \(S_r\),

\[ a_i^{S_q} = a_i^{S_r} + d_i \sum_{h \notin S_q} a_h^{S_r} + d_i [v(S_q) - v(S_r)] \]

where \(\sum_{i \in S_q} d_i = 1\) and \(d_i > 0\). (That is, by reducing from \(S_r\), the
members of \(S_q\) can keep what they get in \(S_r\), divide up the
payments made in \(S_r\) to the people ejected when \(S_q\) was formed,
and divide up the increase in value.) Thus it is possible that
\(a_i^{S_q} > a_i^{S_r}\).

Case 2. \(v(S_q) = v(S_r)\). As in Case 1, it is possible to form \(S_q\) so
that

\[ a_i^{S_q} = a_i^{S_r} + d_i \sum_{h \notin S_q} a_h^{S_r}. \]

Hence, as in Case 1, it is possible that \(a_i^{S_q} > a_i^{S_r}\).

Case 3. v(S_q) < v(S_r). In this case, there are three sub-cases
according to the size of the sum of the payments in S_r to the h,
h ∉ S_q. It may be that Σ_{h ∉ S_q} a_h^{S_r} ⋛ v(S_r) − v(S_q).

Case 3.1. Σ_{h ∉ S_q} a_h^{S_r} > v(S_r) − v(S_q). Since
[v(S_q) + Σ_{h ∉ S_q} a_h^{S_r}] > v(S_r), it is possible to form S_q
in such a way that a_i^{S_q} > a_i^{S_r}, i ∈ S_q and S_r. For
example, if a_i^{S_r} = d_i (v(S_r)), then let
a_i^{S_q} = d_i [v(S_q) + Σ_{h ∉ S_q} a_h^{S_r}].

Case 3.2. Σ_{h ∉ S_q} a_h^{S_r} = v(S_r) − v(S_q). By the condition
of this case, that [v(S_q) + Σ_{h ∉ S_q} a_h^{S_r}] = v(S_r), it is
possible that a_i be chosen so that at least a_i^{S_q} = a_i^{S_r}.

Case 3.3. Σ_{h ∉ S_q} a_h^{S_r} < v(S_r) − v(S_q). Let
b = v(S_r) − v(S_q) − Σ_{h ∉ S_q} a_h^{S_r}; then b is the amount
that all i, i ∈ S_q, [...] from enlarging the coalition S_q to S_r.
Let c = v(−S_q) − v(−S_r). Then c is the amount that −S_q can afford
to offer [...]. If Σ_{h ∉ S_q} a_h^{S_r} > 0, then c > b. If
Σ_{h ∉ S_q} a_h^{S_r} = 0, then c = b. We can expect from assertion
(7) that −S_q will [...] to be expected in a zero-sum game.) Hence
a_i^{S_q} = a_i^{S_r} [...] if S_r is formed, it is not realizable
because [...] appropriate bidding force the formation of S_q [...]
when there are the following discernible entities:

(1) S_q; (2) those members of −S_q for which a_i is a negative
number; and (3) those members of −S_q for which a_i is a positive
number or zero.

In Cases 1 and 2, a_i^{S_q} > a_i^{S_r}, and in sub-case 3.1 and [...]
condition (II), that a_i^{S_q} ≥ a_i^{S_r}, is fulfilled by S,
S ∈ W^m.

Third, it remains to show that only S, S ∈ W^m, fulfill both
conditions. Since in cases 1, 2, and 3.1 of the proof that S,
S ∈ W^m, fulfilled condition (II), it was shown that S_q could
satisfy the inequation, and since it had been established that
a_i^{S_q} > a_i^{S_p}, it follows that, in these cases, only S,
S ∈ W^m, satisfies both conditions. But in cases 3.2 and 3.3 [...]
a_i^{S_q} = a_i^{S_r}. Since it is thus possible that, for some
choice of [...], both S_r and S_q are possible candidates for
fulfilling condition (II). But since a_i^{S_q} = a_i^{S_r}, where
q < r, S_r does not fulfill condition (I), which requires that
a_i^{S_r} > a_i^{S_q}. But when q = m, S_q does fulfill condition (I),
even in cases 3.2 and 3.3. Hence only S, S ∈ W^m, fulfills both
conditions simultaneously. Thus only minimal winning coalitions are
realizable. That is, W^q = W^m.

IV 

The size principle does not by any means solve all the problems
connected with n-person games. Since in a simple majority game
where players are weighted equally, there are t possible coalitions
in W^m, where t = C(n, m) is the binomial coefficient, it is
apparent that the size principle does not narrow the selection down
to a unique coalition. (As I have shown in (1), pp. 127-39, however,
in some simple majority games where players are weighted unequally,
[W^m is] a single member set.) Furthermore, the size principle
tells us very little about imputations, except that given some
payoff to i in S_p or S_r, the payoff to i in S_q is equal to or
better than the payoff in S_p or S_r. Finally, since the narrowing
that permitted the size principle eliminated games in which
particular players are specially favored or disfavored, it tells
us nothing about games in which discrimination is permitted.
In short there are many non-unique features of predictions in
n-person theory. Nevertheless, as I tried to demonstrate in (1),
the narrowing accomplished here permits some new and
revealing interpretations of politics.
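
To give a sense of how far the size principle is from picking a unique coalition, the count t can be tabulated for a few game sizes. The sketch below is illustrative only: it assumes that in an equally weighted simple majority game the minimal winning size is m = floor(n/2) + 1, and the function names are hypothetical, not part of the original argument.

```python
from math import comb

def minimal_winning_size(n: int) -> int:
    """Smallest number of equally weighted players forming a majority of n
    (assumed here to be floor(n/2) + 1)."""
    return n // 2 + 1

def count_minimal_winning(n: int) -> int:
    """Number of distinct coalitions of exactly the minimal winning size."""
    return comb(n, minimal_winning_size(n))

for n in (3, 5, 9, 101):
    print(n, minimal_winning_size(n), count_minimal_winning(n))
```

Even for modest n the count grows quickly, which is the sense in which the principle restricts the size of the winning coalition but not its membership.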
A Mathematical Model of Policy
Formation in a Democratic Society 1 

OTTO A. DAVIS and MELVIN HINICH 

Carnegie Institute of Technology 

1. Introduction 


It is obvious that there are many factors which influence the 
policies adopted by a democratic government. Close observers of 
the political scene easily can cite instances where the very com¬ 
plexity of the governmental organization allows one part of that 
entity to have policies which serve to frustrate the policies of 
another part. It is equally clear that instances exist in our complex 
system where some policies of some parts of our government are 
unknown to our elected leaders (not to mention the people). There 
is no doubt but that any truly general and complete theory of 
policy formation should explain such anomalies. Nevertheless, they 
are ignored in the developments which follow. Instead of these 
anomalies attention is centered on an idealized situation where 
full knowledge of governmental policy is available to all. 

It is also evident that in a democracy where a government
enjoys power because it won an election, that government’s
policies must bear some relationship to the desires of the voters.
The determination of this relationship is the problem with which
this paper is concerned. Nevertheless, it should be admitted at the
outset that the very concept of the “desires of the voters” is
somewhat ambiguous. Although it cannot be denied that some members
of the population (and perhaps all of the relevant population for
some subset of issues) have clearly defined positions on policy,
evidence reported by various pollsters would seem to indicate that,
at least for some issues, the very debate connected with an election
may have an influence upon public opinion. Partly because such an
influence does not seem to be fully understood, this phenomenon also
is omitted from the model developed here. Perhaps the sole
justification of this and the above omissions is that one must learn
to walk before one is able to run. Yet, these omissions mean that
this paper should be viewed as an effort to study only one idealized
aspect of the real situation.

The particular (and main) problem investigated here is as fol¬ 
lows: Given the precisely defined (see the developments below) 
and unchangeable preferences of the voters in the population, 
candidates for public office compete for votes by announcing be¬ 
fore an election their exact position on each of the relevant issues. 
Each voter compares the positions taken by the various candidates 
and casts his vote for that particular candidate whose position is 
“nearest” (a more careful definition is given below) his own most 
preferred position. It is assumed that, once elected, a (former) 
candidate will adopt those policies which he announced during 
the campaign. Thus the questions to be answered are whether, 
and under what conditions, dominant strategies exist for the can¬ 
didates. 

Other problems also are analyzed within this context. For
example, the policy choice of a beneficent dictator is compared with
the dominant strategy for two candidates in a democratic system. 
The dilemma inherent in the process of nominating a candidate is 
discussed. Finally, a basic assumption is relaxed to allow for the 
possibility that one portion of the population may not care about 
some particular subset of issues while the other portion feels 
strongly about these very issues. 

2. Basic Assumptions and Tools of Analysis 

In order to be able to handle these basic problems, it is neces¬ 
sary to make some simplifying assumptions. First, it must be pre¬ 
sumed that, at least conceptually, policies can be measured by 
certain indices. Consider, for example, the issue of civil rights. One 
might use several indices to measure the various characteristics 
of this issue. Voting rights might be measured by the percentage 
of the adult, nonwhite population which can be registered to vote. 
Integration in the schools and in housing might be measured by 
the variance of the percentages of nonwhites attending the various 
schools and living in the various localities respectively. Job dis¬ 
crimination might be measured by the percentage of nonwhites 
employed in various categories of work. On the other hand, one 
might use an index of these various indices. The crucial point is 
that some type of measurement be admitted. 

Granted that policies can be measured by the postulated indices,
another (even stronger) assumption is now appropriate. It is that
each voter in the population uses the same index to measure any
given policy. In other words, the indices measuring the various
policies are common to all voters. It is apparent that this
assumption [...] since the number of variables which measure any
policy [...] assumed to [...] voters in the [...].

It is assumed further that each voter has a preferred position for
each policy. This preferred position can be represented by
certain values of the variables which measure each policy.
Consequently, the i-th voter's preferred position on all the issues
of policy can be represented by the vector

x_i = [x_i1, x_i2, ..., x_in]'

where the components of the vector represent the desired values
of the indices which measure the given policies.² Thus x_i1 might
measure the percentage of the adult, nonwhite population which
can be registered to vote; x_i2 and x_i3 might measure respectively
the variance of the percentages of nonwhites attending the various
schools and living in the various localities; etc.

In a manner similar to which the preferred position (or point)
of an individual voter is represented, the vector

θ_j = [θ_j1, θ_j2, ..., θ_jn]'

can be used to represent the position (or “platform”) of the j-th
candidate. This vector is presumed to be announced before any
election and is known to all voters.

Since only in a degenerate case could x_i = θ_j for all i, some
provision must be made for the “loss” which any voter feels when his
preferred policy position is not the one selected for enactment.

Such provision can be accomplished by the introduction of
individual loss functions. Obviously, loss functions should exhibit
certain desirable properties. Let θ represent governmental policy.
Then θ is a vector composed of the indices discussed above. For the
moment, view the components of θ as variables. Consider the i-th
voter. Obviously, if x_i = θ, this individual's loss should be zero
since governmental policy is the same as his preferred position on
all issues. However, consonant with the notion that each individual
does have a preferred position on

tk. equality deuce. 








each issue of policy, then if x_i ≠ θ the i-th voter should have a
positive loss. These properties are present in the following
specification of the i-th voter’s loss function:

(2.1) L_i(θ) = (x_i − θ)'A(x_i − θ)

where L_i represents the loss function and A is a symmetric,
positive definite matrix of rank n.³

Observe that (2.1) is a quadratic form. Obviously, the
specification of this specific form requires further justification,
since other functions possess the two properties discussed above.
First, the quadratic form is the simplest of the class of functions
having these properties and it is preferable, other things being
equal, on this basis. Second, a loss function has an obvious
relationship to the economist’s notion of a utility function and, in
fact, a quadratic loss function can be derived from a quadratic
utility function. The basic notion underlying utility analysis is
that of declining marginal utility. A quadratic utility function
incorporates this concept. It follows that a quadratic loss function
is acceptable on this basis. Third, it can be argued that no matter
what the true loss function (at least if it incorporates the
properties specified in the above paragraph), then a quadratic can
serve as an acceptable approximation. This argument can be based
upon expanding the function in a Taylor’s series, noting that the
first order terms are zero if the loss is symmetric, and throwing
away the third and higher order terms. Finally, the authors argue
that the proof of the pudding is in the eating and that intuitively
interesting and informative results can be derived on the basis of
quadratic losses.

For the special case of n = 1 (one issue with a single index) the
loss function (2.1) is plotted in Figure 1. Note that it is symmetric
around the point of zero loss (x_i = θ).
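
The two properties required of the loss function (zero at the preferred point, positive elsewhere) and its symmetry can be checked with a small numerical sketch of (2.1). The matrix A and the preferred point below are made-up illustrative values, and `quadratic_loss` is a hypothetical helper name.

```python
def quadratic_loss(x, theta, A):
    """Return (x - theta)' A (x - theta) for lists x, theta and a
    nested-list matrix A, as in (2.1)."""
    d = [xi - ti for xi, ti in zip(x, theta)]
    n = len(d)
    return sum(d[r] * A[r][c] * d[c] for r in range(n) for c in range(n))

A = [[2.0, 0.5],
     [0.5, 1.0]]      # symmetric and positive definite (illustrative values)
x_i = [0.6, 0.3]      # hypothetical preferred position of one voter

print(quadratic_loss(x_i, x_i, A))         # loss at the preferred point
print(quadratic_loss(x_i, [0.4, 0.5], A))  # loss at some other policy
print(quadratic_loss(x_i, [0.8, 0.1], A))  # loss at the mirror-image policy
```

The last two policies lie at opposite displacements from the preferred point and produce the same loss, which is the multidimensional counterpart of the symmetry visible in the single-issue plot.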

It should be pointed out that the matrix A in (2.1) is not given
a subscript. The reason for this omission is a rather strong
assumption. Although the components of the vector x_i can assume any
values which the i-th individual might desire, it is presumed that
the tastes of the voters are such that the matrix A enters the loss
function of each individual. The population of voters is assumed
[...]. In words, although voters [...], all voters assign the
same relative “weight” (or importance) to any given issue. This
(admittedly unrealistic) assumption is made solely for the reason
of analytical convenience. It [...], on the other hand, that since
utility [...] individual scale of measurement [...] with no
inter-[...], care must be exercised in attaching any significance to
[a] comparison of the numerical values of the losses [of] given
individuals. However, the very notion that utility [...] to a
monotonic transformation [...] rationalization to the assumption
that the matrix A [enters the] loss function of each individual. At
least for [a] class of loss functions, suitable transformations
could be [...] these functions so that this assumption [...].⁴

3 By definition an n×n matrix A is symmetric if it is equal to its
transpose; that is, if A = A'. The assumption that A is positive
definite is a sufficient condition [...]. See G. Hadley, Linear
Algebra (Reading: Addison-Wesley, 1961), pp. 251-63.

4 See, however, the discussion of Section 6 where this assumption is
modified slightly.

[...] problem of [characterizing] the population of voters. [This]
can be accomplished by presuming that the preferred [points] x_i
have been plotted into an n dimensional [space] and that [...] has
been suitably normalized into a density f(x). [Although] this
density is naturally discrete, for the most [part it can be]
approximated by a continuous density. It should be [noted that this]
method of characterizing the population [...].

[For convenience] of notation, it is presumed that

(2.2) E(x) = δ

so that δ [is a] vector whose components are the means of the
components of the x_i. Also,

(2.3) E(x − δ)(x − δ)' = Σ

[...] the norm. Let z represent [an n component vector]. The norm
of the vector z with respect to the matrix A is defined as follows:

(2.4) ||z|| = √(z'Az)

[The norm repre]sents the “distance” between two vectors. For
example, the norm

(2.5) ||z_1 − z_2|| = √((z_1 − z_2)'A(z_1 − z_2))

represents the “distance” between vectors z_1 and z_2 with respect to
the matrix A. In the development below, the norm with respect to
the matrix AΣA is used also. This norm is distinguished by the
notation

(2.6) ||z||* = √(z'AΣAz)

3. Two Candidates and a Beneficent Dictatorship

It is convenient and [...] desires to choose a vector θ such that
the expression

(3.1) E(x − θ)'A(x − θ) = E||x − θ||² = tr ΣA + ||δ − θ||²

is minimized.⁵ It is clear that this expression is at a minimum when
θ = δ. In other words, granted the dictator’s value judgment, and
also [...], [...]
the preferred positions of the individuals in the population.
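
The decomposition in (3.1) can be verified numerically for a small finite population, which also shows why the mean vector minimizes the dictator's criterion. This is only a sketch: the four preferred points, the matrix A, and the trial policy θ are invented for illustration, expectations are taken as simple averages over the population, and all function names are hypothetical.

```python
def quad(d, A):
    """Quadratic form d'Ad for a list d and a nested-list matrix A."""
    n = len(d)
    return sum(d[r] * A[r][c] * d[c] for r in range(n) for c in range(n))

def mean_vector(points):
    """Componentwise mean (the vector delta) of the preferred points."""
    return [sum(p[k] for p in points) / len(points)
            for k in range(len(points[0]))]

def covariance(points, delta):
    """Population variance-covariance matrix (divides by N, not N - 1)."""
    n, N = len(delta), len(points)
    return [[sum((p[r] - delta[r]) * (p[c] - delta[c]) for p in points) / N
             for c in range(n)] for r in range(n)]

def expected_loss(points, theta, A):
    """Average of (x - theta)'A(x - theta) over the population."""
    return sum(quad([x - t for x, t in zip(p, theta)], A)
               for p in points) / len(points)

def trace_of_product(S, A):
    """tr(SA), computed without forming the matrix product."""
    n = len(S)
    return sum(S[r][c] * A[c][r] for r in range(n) for c in range(n))

A = [[2.0, 0.5], [0.5, 1.0]]                               # illustrative weights
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 2.0]]  # hypothetical voters
delta = mean_vector(points)
sigma = covariance(points, delta)
theta = [2.0, 0.5]                                         # an arbitrary policy

lhs = expected_loss(points, theta, A)
rhs = trace_of_product(sigma, A) + quad([d - t for d, t in zip(delta, theta)], A)
print(lhs, rhs)                          # the two sides of (3.1)
print(expected_loss(points, delta, A))   # the loss when theta equals the mean
```

Because tr ΣA does not depend on θ and the remaining term is a squared norm, any policy other than the mean adds a strictly positive amount to the average loss.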

5 In this notation, (x − θ)'A(x − θ) = ||x − θ||². [...] Let B [be]
an n×n matrix whose elements are denoted b_ij. [Then]
tr B = Σ_{i=1}^{n} b_ii. In other words, the trace of B is the sum
of the diagonal elements.

Turning now to the case of two-candidate competition in a
democratic society, it is convenient to begin by stating [that the]
candidates will be called “one” and “two.” [Thus θ_1] denotes the
platform of the 1st candidate and θ_2 represents the platform of the
2nd candidate. These platforms are announced before the election day
and form the basis for the voters’ choices between the two
candidates. (Recall the convenient assumption that the elected
candidate will honor his platform.) Essentially, a voter is assumed
to choose that candidate whose platform gives the smallest utility
loss. In other words, the i-th voter will cast his ballot for the
1st candidate if

(3.2) (x_i − θ_1)'A(x_i − θ_1) < (x_i − θ_2)'A(x_i − θ_2)

and it obviously follows that if

(3.3) (x_i − θ_1)'A(x_i − θ_1) > (x_i − θ_2)'A(x_i − θ_2)

the 2nd candidate will receive the i-th individual’s vote. In the
unlikely event that the utility losses are the same, it can be
presumed that the voter makes his choice by flipping a fair coin.
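
The voting rule just stated, including the coin flip for exact ties, can be written as a short function. This is a minimal sketch with hypothetical names; the matrix A and the platforms are illustrative values only.

```python
import random

def loss(x, theta, A):
    """Quadratic loss (x - theta)'A(x - theta) of a voter at x."""
    d = [xi - ti for xi, ti in zip(x, theta)]
    n = len(d)
    return sum(d[r] * A[r][c] * d[c] for r in range(n) for c in range(n))

def vote(x, theta1, theta2, A, rng=random):
    """Return 1 or 2: the candidate whose platform gives the smaller loss.
    Rule (3.2) yields 1, rule (3.3) yields 2, and an exact tie is settled
    by a fair coin."""
    l1, l2 = loss(x, theta1, A), loss(x, theta2, A)
    if l1 < l2:
        return 1
    if l1 > l2:
        return 2
    return rng.choice((1, 2))

A = [[2.0, 0.5], [0.5, 1.0]]
theta1, theta2 = [0.1, 0.0], [1.0, 1.0]
print(vote([0.0, 0.0], theta1, theta2, A))   # voter near the first platform
print(vote([1.0, 0.9], theta1, theta2, A))   # voter near the second platform
```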

Having developed the rules for a voter’s choice of candidates, it
is appropriate to consider the relationship between this analysis
and the works of Hotelling,⁶ Downs,⁷ and Tullock.⁸ The unifying
elements in the relevant parts of these works are two presumptions.
First, there is only one index of policy. Second, distance can
be used to determine how a voter will cast his ballot. Thus in the
terminology of this analysis, let n = 1. Then a representative loss
function is presented in Figure 1. Given the previous assumptions,
this function must be symmetric. It follows that (3.2) obtains if
and only if |x_i − θ_1| < |x_i − θ_2| and (3.3) obtains if and only
if |x_i − θ_1| > |x_i − θ_2|. In other words, a voter chooses that
candidate whose platform is nearest to his own (the voter’s)
preferred position.⁹

Consider the number θ* which satisfies the following conditions:

(3.4) P(x < θ*) ≤ 1/2 and P(x > θ*) ≤ 1/2

where P represents “probability.” In other words, θ* is the (not
necessarily unique) median of f(x).

Consider now the problem of the choice of platforms. Suppose
that the 1st candidate selects the platform θ_1 = θ* and the 2nd
candidate selects some platform θ_2 ≠ θ* where θ* represents any
number which satisfies (3.4). Put another way, the 1st candidate
chooses a median strategy while the 2nd candidate selects a
non-median platform. Given these choices, it is clear that (under
the assumptions) the 1st candidate will win the election. In other
words, the median is a dominant strategy. A choice of the median
insures a candidate of at least an even chance of winning.

6 Harold Hotelling, “Stability in Competition,” Economic Journal, XXXIX (1929),
41-57; reprinted in G. J. Stigler and K. E. Boulding (eds.), Readings in Price Theory
(Chicago: Richard D. Irwin, 1952), pp. 467-484.

7 Anthony Downs, An Economic Theory of Democracy (New York: Harper, 1957).

8 Gordon Tullock, The Politics of Bureaucracy (Washington: Public Affairs Press,
1965).

9 The notation |·| denotes “absolute value.” In a single dimension, this is a measure
of distance.

In order to justify this theorem, it is sufficient to observe that
the very definition of the median (3.4) insures the 1st candidate
of having a platform nearer to the preferred positions of at least
one-half of the voters than the platform of the 2nd candidate. This
fact is also obvious in Figure 2 where a density f(x) is drawn, θ*
represents the median, and θ_2 ≠ θ* is an arbitrary choice of the
other candidate.

Given the presumed voting rules (3.2 and 3.3), it is clear that
the best that the 2nd candidate can do is also to select a median
strategy θ_2 = θ*. In this event both candidates have an even chance
of winning.

The dominance of the strategy of playing the median means that 
insofar as candidates are interested in winning the election, they 
should try to achieve this “middle position.” Non-median strategies 
are to be avoided, for they only invite defeat. (At least under the 
assumptions made here, which implicitly include the presumption 
that all qualified individuals vote.) 

Contrast this result with the presumed choice of a beneficent 
dictator. When the density f(x) of preferred points is such that 
the mean and median coincide, then the dominant strategy for a 
candidate is the same as the beneficent dictator’s choice. How¬ 
ever, if the density f(x) is skewed so that the mean and median 
are not the same, then the choices differ. 
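
The difference between the candidate's dominant strategy and the dictator's choice can be seen in a small single-issue example with a skewed population. The five preferred points below are invented for illustration; `vote_share` is a hypothetical helper that counts an exact tie as half a vote, which is the expected result of the fair coin.

```python
from statistics import mean, median

def vote_share(points, theta1, theta2):
    """Fraction of voters whose preferred point is nearer to theta1 than
    to theta2; an exact tie contributes one half."""
    score = 0.0
    for x in points:
        d1, d2 = abs(x - theta1), abs(x - theta2)
        if d1 < d2:
            score += 1.0
        elif d1 == d2:
            score += 0.5
    return score / len(points)

points = [1, 2, 3, 4, 20]          # hypothetical skewed population
theta_median = median(points)      # the candidate's dominant strategy
theta_mean = mean(points)          # the beneficent dictator's choice

print(theta_median, theta_mean)
print(vote_share(points, theta_median, theta_mean))
# The median platform never falls below one half against any integer rival:
print(min(vote_share(points, theta_median, rival) for rival in range(0, 21)))
```

In this sample the single outlying voter pulls the mean well above the median, so a candidate who adopts the dictator's choice loses to one who plays the median.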

The question arises as to whether this result can be extended. It 
is particularly interesting to inquire as to whether anything can 
be said when the number of components in the vector x is greater 
than one. In this regard, let n > 1 be an arbitrary integer. This 
means that f(x) is a multivariate density. It is necessary to per¬ 
form a certain amount of algebraic manipulation to get the voting 
rules into a form which is useful for analysis. 

Consider the instance in which the i-th individual votes for the
1st candidate so that (3.2) is presumed to obtain. Dropping the i
subscripts for convenience, it is easily seen that (3.2) can be
expanded into the following equivalent statement

(3.5) x'Ax − 2x'Aθ_1 + θ_1'Aθ_1 < x'Ax − 2x'Aθ_2 + θ_2'Aθ_2

since x'Aθ_1 = θ_1'Ax and x'Aθ_2 = θ_2'Ax. By taking θ_2'Aθ_2 to the
left hand side and 2x'Aθ_1 to the right hand side, the expression

(3.6) θ_1'Aθ_1 − θ_2'Aθ_2 < 2x'A(θ_1 − θ_2)

is obtained. This can be written as

(3.7) (θ_1 + θ_2)'A(θ_1 − θ_2) < 2x'A(θ_1 − θ_2)

Subtracting 2δ'A(θ_1 − θ_2) from both sides, the expression

(3.8) (θ_1 + θ_2)'A(θ_1 − θ_2) − 2δ'A(θ_1 − θ_2) < 2(x − δ)'A(θ_1 − θ_2)

is obtained. Examine the left hand side of this expression (3.8).
The following is simply an algebraic manipulation.

(3.9) (θ_1 + θ_2)'A(θ_1 − θ_2) − 2δ'A(θ_1 − θ_2) =
      (θ_1 + θ_2 − 2δ)'A(θ_1 − θ_2) = [(θ_1 − δ) + (θ_2 − δ)]'A(θ_1 − θ_2)

Obviously, simultaneously adding and subtracting δ does not alter
the value of this expression. Thus one can write (3.9) in the form

(3.10) [(θ_1 − δ) + (θ_2 − δ)]'A[(θ_1 − δ) − (θ_2 − δ)] =
       (θ_1 − δ)'A(θ_1 − δ) − (θ_2 − δ)'A(θ_2 − δ) =
       ||θ_1 − δ||² − ||θ_2 − δ||²

and the last part of this step is nothing more than the notation
introduced in (2.5). It is easily observed from (3.10) and (3.8)
that if (3.2) obtains, then

(3.11) 2(x − δ)'A(θ_1 − θ_2) > ||θ_1 − δ||² − ||θ_2 − δ||²

also holds. In other words, the i-th individual votes for the 1st
candidate if (3.11) is true.
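
Since (3.11) is obtained from (3.2) by rearrangement alone, the two rules should classify every voter identically. The sketch below checks this on randomly drawn preferred points; the matrix, platforms, and mean vector are arbitrary illustrative values, and draws that are numerically indistinguishable from a tie are skipped because floating-point rounding could then differ between the two forms.

```python
import random

def bilinear(u, A, v):
    """Return u'Av for lists u, v and a nested-list matrix A."""
    return sum(u[r] * A[r][c] * v[c]
               for r in range(len(u)) for c in range(len(v)))

def sub(u, v):
    """Componentwise difference u - v."""
    return [a - b for a, b in zip(u, v)]

def prefers_first_by_loss(x, th1, th2, A):
    """Rule (3.2): strictly smaller quadratic loss under platform th1."""
    d1, d2 = sub(x, th1), sub(x, th2)
    return bilinear(d1, A, d1) < bilinear(d2, A, d2)

def prefers_first_by_311(x, th1, th2, delta, A):
    """Rule (3.11): 2(x - delta)'A(th1 - th2) exceeds
    ||th1 - delta||^2 - ||th2 - delta||^2."""
    e1, e2 = sub(th1, delta), sub(th2, delta)
    lhs = 2.0 * bilinear(sub(x, delta), A, sub(th1, th2))
    return lhs > bilinear(e1, A, e1) - bilinear(e2, A, e2)

A = [[2.0, 0.5], [0.5, 1.0]]                   # symmetric, positive definite
delta, th1, th2 = [0.5, 0.5], [0.4, 0.7], [0.9, 0.2]
rng = random.Random(0)                         # fixed seed for repeatability

agree = total = 0
for _ in range(1000):
    x = [rng.uniform(-2.0, 3.0), rng.uniform(-2.0, 3.0)]
    d1, d2 = sub(x, th1), sub(x, th2)
    if abs(bilinear(d1, A, d1) - bilinear(d2, A, d2)) < 1e-9:
        continue                               # skip numerical near-ties
    total += 1
    if prefers_first_by_loss(x, th1, th2, A) == prefers_first_by_311(x, th1, th2, delta, A):
        agree += 1
print(agree, total)
```

The check relies on A being symmetric, which is the property used in passing from (3.6) to (3.7).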

For the moment, consider x to be a vector selected at random
from f(x). Then it is useful to know the mean and variance of
one half the quantity on the left hand side of (3.11).

(3.12) E[(x − δ)'A(θ_1 − θ_2)] = 0
       Var[(x − δ)'A(θ_1 − θ_2)] = (θ_1 − θ_2)'AΣA(θ_1 − θ_2)

The following definition (see (2.6)) is simply a matter of notation.

(3.13) √((θ_1 − θ_2)'AΣA(θ_1 − θ_2)) = ||θ_1 − θ_2||*

In other words, ||θ_1 − θ_2||* is simply the standard deviation of
(x − δ)'A(θ_1 − θ_2) when x is considered to be a vector selected
at random from f(x).

Consider the following definitions.

(3.14) y = (x − δ)'A(θ_1 − θ_2) / ||θ_1 − θ_2||*
       t = (||θ_1 − δ||² − ||θ_2 − δ||²) / (2 ||θ_1 − θ_2||*)

Then the statement

(3.15) y > t

is equivalent to statement (3.11). In other words, those individuals
for whom (3.15) is true will vote for the 1st candidate.

Expression (3.15) is useful for analysis. It is desired to
investigate the possibility of the 1st candidate being able to select
his platform (policies) θ_1 in such a manner that he is certain to
win the election if θ_1 ≠ θ_2. (Note that if θ_1 = θ_2, then neither
(3.2) nor (3.3) obtains and the election is equivalent to tossing a
coin.) Consider selecting a voter at random from the population
f(x). If

(3.16) P[(x − θ_1)'A(x − θ_1) < (x − θ_2)'A(x − θ_2)] > 1/2

so that more than one half of the voters in the population obtain
a smaller utility loss from the 1st candidate’s platform than from
the one of his opponent, then the 1st candidate is certain to win
the election. The previous analysis shows that

(3.17) P(y > t) > 1/2

is equivalent to (3.16). Furthermore, if f(x) is continuous so that
for θ_1 ≠ θ_2

(3.18) P[(x − θ_1)'A(x − θ_1) = (x − θ_2)'A(x − θ_2)] = 0

then the 1st candidate wins if and only if (3.17) obtains.

It is now necessary to inquire into the conditions under which
(3.17) is true. Suppose that f(x) is a multivariate normal density
with mean vector δ and variance-covariance matrix Σ. Then it is
clear from (3.12) and the definition (3.14) of y that y has a
standard normal distribution. Thus (3.17) is true if and only if
t < 0.
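
Under the normality assumption the share of the vote going to the 1st candidate is P(y > t) with y standard normal, so it can be computed directly from the definition (3.14) of t. The following sketch uses made-up values for A, Σ, δ and the platforms; the function names are hypothetical, and the code assumes A is symmetric and the two platforms differ.

```python
import math

def sub(u, v):
    """Componentwise difference u - v."""
    return [a - b for a, b in zip(u, v)]

def quad(d, M):
    """Quadratic form d'Md."""
    n = len(d)
    return sum(d[r] * M[r][c] * d[c] for r in range(n) for c in range(n))

def mat_vec(M, v):
    """Matrix-vector product Mv."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def t_statistic(th1, th2, delta, A, sigma):
    """t from (3.14): (||th1 - delta||^2 - ||th2 - delta||^2) / (2 ||th1 - th2||*)."""
    diff = sub(th1, th2)
    # ||diff||* = sqrt(diff' A Sigma A diff) = sqrt((A diff)' Sigma (A diff)) for symmetric A
    star = math.sqrt(quad(mat_vec(A, diff), sigma))
    if star == 0.0:
        raise ValueError("t is undefined when the platforms coincide")
    return (quad(sub(th1, delta), A) - quad(sub(th2, delta), A)) / (2.0 * star)

def share_for_first(th1, th2, delta, A, sigma):
    """P(y > t) for a standard normal y, i.e. 1 minus the normal CDF at t."""
    t = t_statistic(th1, th2, delta, A, sigma)
    return 0.5 * math.erfc(t / math.sqrt(2.0))

A = [[2.0, 0.5], [0.5, 1.0]]        # illustrative weights
sigma = [[1.0, 0.3], [0.3, 2.0]]    # illustrative variance-covariance matrix
delta = [0.5, 0.5]                  # mean of the preferred points
rival = [1.0, 1.0]                  # a platform away from the mean

print(share_for_first(delta, rival, delta, A, sigma))   # candidate at the mean
print(share_for_first(rival, delta, delta, A, sigma))   # roles reversed
```

Interchanging the two platforms changes the sign of t, so the two printed shares sum to one.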

Examine the definition (3.14) of t. Suppose that the 1st candidate
selects θ_1 = δ. Obviously, ||δ − δ|| = 0. Then for any choice
of the 2nd candidate such that θ_2 ≠ δ, ||θ_2 − δ|| > 0. It follows
that t < 0 so that (3.17) is true. In other words, if the 1st
candidate selects the policies in his platform to be exactly the
same as the mean of the policies desired by the individuals in the
voting population, and the other candidate does not make the same
choice, then the 1st candidate is certain to win the election.
Conversely, suppose that the 1st candidate selects θ_1 ≠ δ.
Obviously, ||θ_1 − δ|| > 0 in such an instance. If the 2nd candidate
selects θ_2 = δ, then ||θ_2 − δ|| = 0 so that t > 0. Thus

(3.19) P(y > t) < 1/2

is true so that the 2nd candidate is certain to win the election.
Finally, if both θ_1 = δ and θ_2 = δ, then it is obvious that a tie
is expected. The following theorem is established:

Theorem 3.1: Given the assumptions resulting in voting rules (3.2) and
(3.3), then, if the density of preferred points f(x) is normal, the
platform θ = δ is a dominant strategy.

The fact that θ = δ insures a candidate of winning the election if
the opposing candidate does not make the identical choice of 
selecting his platform to be the vector of means of the preferred 
positions, and gives the expectation of a tie if both candidates 
choose the vector of means, indicates that there should be a ten¬ 
dency for wise candidates to select such policies for their plat¬ 
forms. It is interesting to note that insofar as this tendency is ob¬ 
served, then the competition between candidates in a democratic 
process tends to produce the policies which a beneficent dictator 
operating under (3.1) would select. 

The above result depends upon the assumed normality of f(x). 
Since the actual population of voters in any given country is neces¬ 
sarily finite, this assumption means that the presumed normal f (x) 
is an approximation to the actual density. Now for many cases this 
approximation will be sufficiently good. Further, one can argue 
that even if f(x) is not assumed to be a normal density, y can still 
be approximated by a standard normal in many instances. Yet, one 
may wonder whether it is possible to say anything when the dis¬ 
tribution of preferred points f(x) is not known and no approxima¬ 
tions are allowed. The answer is affirmative, at least in the sense 
that certain bounds can be derived. These bounds are stated in 
terms of relative deviations from the vector δ of the means of
preferred points, and they indicate the powerful influences of the
means upon the policies produced by the democratic process.

Let y and t be defined by (3.14). By beginning with (3.3) and
performing steps (3.5) through (3.14), it is seen easily that

(3.20) y < t

is equivalent to (3.3). Therefore, those voters for whom (3.20)
obtains cast their ballots for the 2nd candidate. Without a loss of
generality, consider the case in which

(3.21) ||θ_1 − δ|| > ||θ_2 − δ||

so that t > 0. In other words, the 1st candidate’s platform is a
greater “distance” from the mean vector of preferred points than
is the platform of the 2nd candidate. Noting that E(y) = 0 and
Var(y) = 1, it follows from Tchebyshev’s inequality that

(3.22) P(y < t) ≥ 1 − 1/t²

since the one-sided version of this inequality cannot have a smaller
probability of being true than the two-sided one.¹⁰ Further, from
the definition (3.14) of t, it is obvious that

(3.23) 1/t² = 4(||θ_1 − θ_2||*)² / (||θ_1 − δ||² − ||θ_2 − δ||²)²

For the purpose of the argument, ||θ_1 − θ_2||* must be replaced
by a more convenient quantity. Recall that utility is defined
uniquely only up to a monotonic transformation. Thus it can be
assumed, without loss of generality, that Σ ≤ A⁻¹. If this were not
so, then A could be multiplied by a positive scalar to make it so
without altering any of the analysis or changing anything. It
follows that the presumption AΣA ≤ A is legitimate for the purpose
of analysis. Thus

(3.24) ||θ_1 − θ_2||* ≤ ||θ_1 − θ_2||

follows from this assumption, the definition (3.13) of
||θ_1 − θ_2||* and the definition (2.5) of the norm ||·||. Also,

(3.25) ||θ_1 − θ_2|| ≤ ||θ_1 − δ|| + ||θ_2 − δ||

by the triangle inequality.¹¹ Noting that

(3.26) (||θ_1 − δ||² − ||θ_2 − δ||²)² =
       (||θ_1 − δ|| + ||θ_2 − δ||)² (||θ_1 − δ|| − ||θ_2 − δ||)²

one can use (3.24) and (3.25) to write

(3.27) 1/t² ≤ 4(||θ_1 − δ|| + ||θ_2 − δ||)² /
              [(||θ_1 − δ|| + ||θ_2 − δ||)² (||θ_1 − δ|| − ||θ_2 − δ||)²]

Cancelling the common term in the numerator and denominator,
one can use (3.27) to write (3.22) in the form

(3.28) P(y < t) ≥ 1 − 4 / (||θ_1 − δ|| − ||θ_2 − δ||)²

so that if

(3.29) ||θ_1 − δ|| − ||θ_2 − δ|| > 2√2

then

(3.30) P(y < t) > 1/2

so that the 2nd candidate receives more than one-half of the votes.
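
The distribution-free bound (3.28) depends only on the two distances from the mean, so it is easy to tabulate. The sketch below is illustrative: `vote_floor` is a hypothetical name, the distances are assumed to be measured in the norm (2.4) after the normalization Σ ≤ A⁻¹ described above, and the sample distances are invented.

```python
import math

def vote_floor(d1, d2):
    """Lower bound 1 - 4/(d1 - d2)^2 from (3.28) on the 2nd candidate's
    share of the vote, where d1 = ||theta_1 - delta|| and
    d2 = ||theta_2 - delta|| with d1 > d2 as in (3.21)."""
    if d1 <= d2:
        raise ValueError("the bound is stated for d1 > d2")
    return 1.0 - 4.0 / (d1 - d2) ** 2

threshold = 2.0 * math.sqrt(2.0)            # the gap required by (3.29)
for d1, d2 in ((2.0, 1.0), (1.0 + threshold, 1.0), (4.0, 1.0), (11.0, 1.0)):
    print(d1 - d2, vote_floor(d1, d2))
```

For gaps of 2 or less the floor is zero or negative and therefore says nothing; it reaches one half exactly at the gap 2√2 and approaches one as the gap grows, which is the content of (3.29) and (3.30).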


10 Tchebyshev’s inequality can be stated as follows: Let z be a random variable
with mean δ and standard deviation σ. Then

P(|z − δ| ≤ k) ≥ 1 − σ²/k²

where k is an arbitrary positive number. See, e.g., S. Ehrenfeld and S. Littauer,
Introduction to Statistical Method, New York: McGraw-Hill, 1964, pp. 132-133, for a
proof of an alternative form of Tchebyshev’s inequality.

11 An intuitive understanding of the meaning of the triangle inequality can be
gained with recourse to the following example: Let u, r, and s denote three points in
space. (One may think of the points u, r, and s as being the three vertices of a
triangle.) Then the distance between any two of the points, say u and r, must be less
than or equal to the distance between u and s plus the distance between s and r.
For a proof of the triangle inequality, see P. R. Halmos, Finite Dimensional Vector
Spaces (2d ed., Princeton: D. Van Nostrand Co., 1958), pp. 125-26.
From (3.15) and (3.14) it can be seen that [...] = t [...] due to the
definition (3.32) of θ_1 and the [...] before (and [...]) that kA
[...] so that AΣA ≤ A, then it follows from the definitions (2.4)
and (2.6) of the two norms that

(3.35) ||θ_1 − θ_2||* ≤ ||θ_1 − θ_2|| [...]

due to the definition (3.32) of θ_1. Therefore

(3.36) [...]

and, noting the definition (3.32) of θ_1, the right hand side of
(3.36) is equal to

(3.37) [...]

where 1 is defined by [...]. Let e_n represent the minimum
eigenvalue of the n×n matrix A.¹² Then e_n > 0 due to the fact that
the matrix A is assumed to be positive definite.¹³ Also, for any
positive definite matrix A and any n component vector z

(3.38) z'Az ≥ e_n Σ_{i=1}^{n} z_i²

so that for the case in point

(3.39) 1'A1 ≥ n e_n

since [the square] of one is one and there are n ones in the vector
1. Substituting (3.39) into the right hand side of (3.37), it is
easily seen from (3.36) that

(3.40) [...]

Let n → ∞ in such a manner that the n×n matrix A remains positive
definite so that the e_n are bounded away from the origin. Then

(3.41) lim [...] = ∞

and from the [...] definition (3.34) of t, and relationship (3.40),
t → −∞ as n → ∞. Therefore

12 An eigenvalue can be defined as follows: Let B represent an n×n matrix and
z an n component vector. Consider the relationship Bz = λz where λ is a scalar.
An eigenvalue of the matrix B is some value of the scalar λ for which this
relationship obtains for z ≠ 0. A matrix of order n has at most n distinguishable
eigenvalues. The discussion above is concerned with the smallest of the
eigenvalues.

13 See G. Hadley, Linear Algebra (Reading: Addison-Wesley, 1961), p. 256.
(3.42) $P(y > t) \to 1$ as $n \to \infty$

where the 1 in (3.42) is the number one. The following theorem is 
established: 

Theorem 3.3: Given the assumptions resulting in voting rules (3.2)
and (3.3), and given that the platforms of the two candidates are
defined by (3.32), then if $n \to \infty$ while the $n \times n$ matrix A remains positive
definite, the fraction of the total vote going to the 1st candidate
approaches one.

This theorem indicates the power of the influence of the vector
$\delta$ of means of the preferred positions. It also has a number of
interesting interpretations. One might infer, for example, that as the
population becomes more sophisticated in the manner in which
policies are viewed, and as the number of issues of policy
increases, then the chance of an extremist candidate winning the
election goes down no matter what the density of preferred points.
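The limiting behavior in Theorem 3.3 can be illustrated numerically. The sketch below is only an illustration under assumptions that are not taken from the text: A and the variance-covariance matrix are diagonal, the rival platform is the mean shifted by a constant eps in every component, and the standardized variable y is treated as unit normal. The function name and its parameters are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def share_for_mean_platform(a_diag, var_diag, eps):
    """Vote share of the candidate whose platform is the mean vector.

    Assumptions (illustrative only): A = diag(a_diag), the
    variance-covariance matrix is diag(var_diag), the rival stands at
    mean + eps * (1, ..., 1), and y is unit normal.
    """
    # Squared distance between the rival platform and the mean: eps^2 * 1'A1.
    dist_sq = eps ** 2 * sum(a_diag)
    # Starred norm of the platform difference: sqrt(eps^2 * 1'A Sigma A 1).
    star = abs(eps) * math.sqrt(sum(a * a * v for a, v in zip(a_diag, var_diag)))
    t = -dist_sq / (2.0 * star)
    return 1.0 - normal_cdf(t)  # P(y > t)

# The share rises toward one as the number of issues n grows.
shares = {n: share_for_mean_platform([1.0] * n, [1.0] * n, 0.5)
          for n in (1, 4, 16, 64, 256)}
```

With identity matrices the threshold reduces to t = -(|eps|/2) sqrt(n), so the share is the normal probability of falling below (|eps|/2) sqrt(n), which approaches one as n increases.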

4. Candidate Selection by Primaries and a General Election 
The analysis of Section 3 ignored the phenomenon of political 
parties. Certainly, the mere fact that parties select the candidates 
who run in the general election may place restrictions upon the 
strategy or platform which the candidates can choose. Even when 
the terms strategy and platform (used interchangeably here) are 
defined to mean “that for which the candidate stands” (rather 
than the formal documents drawn up by the U.S. parties), it must 
be admitted that in a sense the candidate “represents” the party. 
Consequently, it is of interest to examine a situation in which a 
candidate has first to win the nomination in his own party and then 
must compete in the election on the basis of the same strategy 
(platform) which won for him the party’s nomination. 

Let the totality of registered voters be divided into two mutually
exclusive and exhaustive populations (parties) which are denoted
“one” and “two” respectively. 14 Let $w_i$ represent the preferred
position of the i-th voter from the 1st population and $v_j$ the preferred
position of the j-th voter from the 2nd population. Also, represent
by $f_1(w)$ and $f_2(v)$ the respective densities of the preferred
positions of the voters of the 1st and 2nd populations. The means of
these densities are defined by
(4.1) $E(w) = \delta_1$ ; $E(v) = \delta_2$
and the variance-covariance matrices are defined by 


14 Note that this exhaustive division means that no independent voters are allowed. 
(4.2) $E[(w - \delta_1)(w - \delta_1)'] = \Sigma_1$
      $E[(v - \delta_2)(v - \delta_2)'] = \Sigma_2$

where each $\Sigma_k$ is $n \times n$. If $\theta$ represents a policy vector, then let

(4.3) $L_{1i}(\theta) = (w_i - \theta)'A(w_i - \theta)$
      $L_{2j}(\theta) = (v_j - \theta)'A(v_j - \theta)$

represent the respective loss functions of the i-th and j-th voters from
the 1st and 2nd populations. Note especially that the $n \times n$ positive
definite matrix A is common to all voters in both populations.
Of course, it is important to observe that this does not prevent
wide differences in taste from existing between the two homogeneous
populations since no restrictions are placed upon the
preferred positions (the $w_i$ and $v_j$) of the voters in the populations.
Differences between the two populations will be discussed
in terms of the parameters defined by (4.1) and (4.2). Finally, it
is assumed here, as in the previous section, that

(4.4) A" 1 

which, as was explained earlier, is no restriction due to the fact 
that loss functions are uniquely defined only up to a monotonic 
transformation. 

The analysis here is developed under the assumption that a
purely democratic process produces the nominations. This
presumption represents something of a departure from reality, at
least for the U.S. where conventions have the responsibility for
candidate selection. 15 Yet, it is informative to assume that the
candidate really does “represent” the party in the sense that he
is the winner of an all inclusive within-party election.

By boldly making this assumption and also by presuming that 
within any party the number of candidates is always two, the 
analysis of Section 3 can be applied to the nominations. Thus it 
is assumed that the candidates have platforms which are the 
means of the preferred points of the members of their respective 
parties. Accordingly, let the 1st and 2nd candidates be the respective
nominees of the 1st and 2nd parties. Then

(4.5) $\theta_1 = \delta_1$ ; $\theta_2 = \delta_2$

are presumed to be the respective platforms of the two candidates.

There remains the problem of specifying the voters’ rule of 

15 The possibility of “bias” is easily seen by adapting Buchanan and Tullock’s
argument concerning representation to the process of nomination by convention. J. M.
Buchanan and G. Tullock, The Calculus of Consent (Ann Arbor: University of Michigan
Press, 1962), pp. 217-22.
choice between these two candidates in the general election.
Ignoring party loyalty, it is presumed that the i-th and j-th individuals
vote for the 1st candidate if

(4.6) $(w_i - \theta_1)'A(w_i - \theta_1) < (w_i - \theta_2)'A(w_i - \theta_2)$
      $(v_j - \theta_1)'A(v_j - \theta_1) < (v_j - \theta_2)'A(v_j - \theta_2)$

holds and for the 2nd candidate if

(4.7) $(w_i - \theta_1)'A(w_i - \theta_1) > (w_i - \theta_2)'A(w_i - \theta_2)$
      $(v_j - \theta_1)'A(v_j - \theta_1) > (v_j - \theta_2)'A(v_j - \theta_2)$

obtains. 16 Recalling (4.5), it is clear that the voter’s choice depends
upon the two vectors of means $\delta_1$ and $\delta_2$. Since $\delta_1 \ne \delta_2$ is assumed
always, and since $f_1(w)$ and $f_2(v)$ are viewed as being continuous
densities, there is no problem in ignoring the possibility of some
voter being faced with equal losses from the two platforms. 17

Once again, it is desirable to get these voting rules into a form
more amenable to analysis. By performing the operations exhibited
in (3.5-3.10) and recalling that $\theta_1 = \delta_1$ and $\theta_2 = \delta_2$ as stated by
(4.5), one obtains

(4.8) $2(w - \delta_1)'A(\delta_1 - \delta_2) > -\|\delta_1 - \delta_2\|^2$
      $2(v - \delta_2)'A(\delta_1 - \delta_2) > \|\delta_1 - \delta_2\|^2$

as expressions equivalent to those of (4.6). Note that the i and j
subscripts are omitted for convenience. It is obvious that

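(4.9) $E[(w - \delta_1)'A(\delta_1 - \delta_2)] = 0$
      $E[(v - \delta_2)'A(\delta_1 - \delta_2)] = 0$

and it can be shown that

(4.10) $\mathrm{Var}[(w - \delta_1)'A(\delta_1 - \delta_2)] = (\delta_1 - \delta_2)'A\Sigma_1 A(\delta_1 - \delta_2)$
       $\mathrm{Var}[(v - \delta_2)'A(\delta_1 - \delta_2)] = (\delta_1 - \delta_2)'A\Sigma_2 A(\delta_1 - \delta_2)$

Define as in (2.6)

(4.11) $\sqrt{(\delta_1 - \delta_2)'A\Sigma_1 A(\delta_1 - \delta_2)} = \|\delta_1 - \delta_2\|_1^*$
       $\sqrt{(\delta_1 - \delta_2)'A\Sigma_2 A(\delta_1 - \delta_2)} = \|\delta_1 - \delta_2\|_2^*$

and

(4.12) $y_1 = \dfrac{(w - \delta_1)'A(\delta_1 - \delta_2)}{\|\delta_1 - \delta_2\|_1^*}$
       $y_2 = \dfrac{(v - \delta_2)'A(\delta_1 - \delta_2)}{\|\delta_1 - \delta_2\|_2^*}$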

16 It might be noted that, at least with some interpretation, the voting rule need
not conflict with the notion of party loyalty. See the discussion in Chapters 3 and [...]
of A. Campbell, P. E. Converse, W. E. Miller, and D. E. Stokes, The American Voter.

17 Of course, it is easy to take the equal loss possibility into account by assuming
that in such an instance a voter will choose the candidate of his own party. The point
is that this additional assumption does not alter the results of the analysis.
so that

(4.13) $E(y_1) = 0$ ; $E(y_2) = 0$

and

(4.14) $\mathrm{Var}(y_1) = 1$ ; $\mathrm{Var}(y_2) = 1$

Also define

(4.15) $t_1 = -\dfrac{\|\delta_1 - \delta_2\|^2}{2\,\|\delta_1 - \delta_2\|_1^*}$
       $t_2 = \dfrac{\|\delta_1 - \delta_2\|^2}{2\,\|\delta_1 - \delta_2\|_2^*}$

It follows that

(4.16) $y_1 > t_1$ ; $y_2 > t_2$

is equivalent to (4.6). In other words, voters from the 1st and 2nd
parties respectively cast their ballots for the 1st candidate if and
only if (4.16) obtains.

It is necessary to obtain an expression for the portion (fraction)
of the total vote which the 1st candidate receives. Let $\alpha$ represent
that fraction of the total number of voters belonging to the 1st
party. Then $1 - \alpha$ represents the fraction of the total number of
voters belonging to the 2nd party. Imagine selecting a voter at
random from each of the 1st and 2nd populations. Then

(4.17) $R = \alpha P(y_1 > t_1) + (1 - \alpha)P(y_2 > t_2)$

represents the fraction of the total vote going to the 1st candidate.
Obviously, $1 - R$ is the fraction of the vote going to the 2nd candidate.
Thus the 1st candidate wins the election if $R > 1/2$ and the 2nd
candidate wins if $R < 1/2$.

Recall that the norm $\|\delta_1 - \delta_2\|$ can be interpreted as the “distance”
between the mean vectors of the two populations. It is of
interest to determine the effect of increases in this distance.

From assumption (4.4) and the definitions (2.4) and (4.11) of
the two types of norms under consideration here, it follows that

(4.18) $\|\delta_1 - \delta_2\| \ge \|\delta_1 - \delta_2\|_k^*$ , k=1,2.

This means that

(4.19) $t_1 \le -\dfrac{\|\delta_1 - \delta_2\|}{2}$
       $t_2 \ge \dfrac{\|\delta_1 - \delta_2\|}{2}$

Let the distance between the two mean vectors increase. As $\|\delta_1 - \delta_2\| \to \infty$,
then from (4.19) $t_1 \to -\infty$ and $t_2 \to \infty$ so that $P(y_1 > t_1) \to 1$
and $P(y_2 > t_2) \to 0$. From (4.17), $R \to \alpha$. Thus as the distance
between the mean vectors of the two parties increases, voters tend
to stick more and more with their own parties until in the limit the
minority party has no chance and the majority party always wins.
One can speculate that such a situation, where there are large
differences between the (opposing) desires of the two groups
and the minority has no chance of exerting any influence upon
policy, is not very conducive to the continuation of a democracy.
It is plausible to believe that conflict is likely to result and it is
interesting to ponder real world situations such as the Cyprus problem
in the light of this result.
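The limit R approaching the party fraction can be checked in the simplest setting. The following is a minimal sketch assuming a single issue (n=1), so that A is a positive scalar and each variance-covariance matrix is a single variance, and assuming normal densities so that the standardized variables are unit normal. The function and parameter names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def vote_share(delta1, delta2, sigma1, sigma2, alpha, a=1.0):
    """Fraction R of the total vote going to the 1st candidate, as in (4.17).

    One-issue sketch: `a` plays the role of the matrix A, and sigma1, sigma2
    are the standard deviations of preferred points in the two parties.
    Requires delta1 != delta2.
    """
    d = abs(delta1 - delta2)
    dist_sq = a * d * d      # squared distance between the two means
    star1 = a * sigma1 * d   # starred norm for the 1st population
    star2 = a * sigma2 * d   # starred norm for the 2nd population
    t1 = -dist_sq / (2.0 * star1)
    t2 = dist_sq / (2.0 * star2)
    return alpha * (1.0 - normal_cdf(t1)) + (1.0 - alpha) * (1.0 - normal_cdf(t2))

# A minority party (alpha = 0.3) facing ever more distant opponents.
gaps = [1.0, 5.0, 50.0]
shares = [vote_share(0.0, g, 1.0, 1.0, alpha=0.3) for g in gaps]
```

In one dimension the scalar a cancels out of both thresholds, which matches the intuition that each voter simply picks the nearer of the two platforms.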

It is appropriate to consider the relationship between the manner
in which the total vote is divided and the parameters $\Sigma_1$ and
$\Sigma_2$. Letting $\|\delta_1 - \delta_2\|$ be a finite number, suppose that

(4.20) $\Sigma_1 \ge \Sigma_2$

so that the 1st party is allowed to represent a “wider range” of taste
or opinion than is the 2nd party. Granted this greater spread of
preferred points, it is interesting to determine the conditions under
which the 1st party’s candidate can win the election.

Let both $f_1(w)$ and $f_2(v)$ be multivariate normal densities. Then
it is easily seen from the definition (4.12) that both $y_1$ and $y_2$ are
normally distributed with zero means and unit variances. Define

(4.21) $k_1 = \dfrac{\|\delta_1 - \delta_2\|^2}{2\,\|\delta_1 - \delta_2\|_1^*}$
       $k_2 = -\dfrac{\|\delta_1 - \delta_2\|^2}{2\,\|\delta_1 - \delta_2\|_2^*}$

so that by the symmetry of the unit normal distribution

(4.22) $P(y_1 > t_1) = P(y_1 < k_1)$
       $P(y_2 > t_2) = P(y_2 < k_2)$

and (4.17) can be written equivalently

(4.23) $R = \alpha P(y_1 < k_1) + (1 - \alpha)P(y_2 < k_2)$

Note that $k_1 > 0$ so that $P(y_1 < k_1) > 1/2$ and $k_2 < 0$ so that
$P(y_2 < k_2) < 1/2$.

If the 1st candidate is to win the election, then it must be the
case that his fraction of the vote is greater than one half ($R > 1/2$).
Making this assumption, one may obtain from (4.23)

(4.24) $\alpha > \dfrac{1/2 - P(y_2 < k_2)}{P(y_1 < k_1) - P(y_2 < k_2)} > 0$

Granted the assumption (4.20), it is easily seen from the definition
(4.11) of the starred norms that

(4.25) $\|\delta_1 - \delta_2\|_1^* \ge \|\delta_1 - \delta_2\|_2^*$
From (4.25) and definition (4.21) one observes that $k_1 \le -k_2$ so
that

(4.26) $P(y_1 < k_1) \le P(y_2 < -k_2) = 1 - P(y_2 < k_2)$

and by substitution

(4.27) $\dfrac{1/2 - P(y_2 < k_2)}{P(y_1 < k_1) - P(y_2 < k_2)} \ge \dfrac{1/2 - P(y_2 < k_2)}{1 - 2P(y_2 < k_2)} = \dfrac{1}{2}$

It follows from (4.27) and (4.24) that $\alpha > 1/2$. In other words, if
the 1st party is more “dispersed” than the 2nd party in the sense that
its members have more divergent points of view, opinions, and
desires for policies; if both parties choose candidates whose respective
platforms represent the party’s vector of means of the preferred
positions of its members; and if the densities of preferred
positions are normal; then the 1st party can win the election only
if it is the majority party. Obviously, the converse of this statement
is also true. If the 1st party is a minority ($\alpha < 1/2$), and if
(4.20) obtains, then the 1st candidate loses the election.
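The bound on the party fraction in (4.24) can be evaluated directly. The sketch below assumes a single issue (n=1) and normal densities, in which case the two constants of (4.21) reduce to the gap between the means divided by twice each party's standard deviation; the function name and arguments are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def minimum_alpha(gap, sigma1, sigma2):
    """Right hand side of (4.24): the party fraction the 1st party must exceed.

    One-issue sketch: gap = |delta1 - delta2| > 0, and sigma1, sigma2 are the
    standard deviations of preferred points within the two parties.
    """
    k1 = gap / (2.0 * sigma1)    # positive
    k2 = -gap / (2.0 * sigma2)   # negative
    p1, p2 = normal_cdf(k1), normal_cdf(k2)
    return (0.5 - p2) / (p1 - p2)

dispersed = minimum_alpha(2.0, sigma1=2.0, sigma2=1.0)   # 1st party more dispersed
equal = minimum_alpha(2.0, sigma1=1.0, sigma2=1.0)       # equally dispersed parties
cohesive = minimum_alpha(2.0, sigma1=0.5, sigma2=3.0)    # 1st party more cohesive
```

When the 1st party is the more dispersed one, the threshold lies above one half, so only a majority party can win; when it is the more cohesive one, the threshold drops below one half.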

The above discussion makes clear the fact that the minority
party can win under certain conditions. Therefore, one might be
interested in determining when a minority triumph can take place.
Note that if (4.24) is true, then it is implied that

(4.28) $1/2 < \alpha P(y_1 < k_1) + (1 - \alpha)P(y_2 < k_2)$

so that the candidate of the 1st party must win the election. Therefore,
it is important to investigate whether and under what conditions
(4.24) can obtain when $\alpha < 1/2$.

Let it be assumed that $\alpha < 1/2$ and

(4.29) $\Sigma_1 < \Sigma_2$

so that the 1st party is more “cohesive” than the 2nd one in the sense
that it represents a “smaller range” of taste and opinion about
policy. Then the above analysis would tend to indicate that it is
possible for the 1st party to win the election. In order to explain
easily why this can be true, allow the following, somewhat more
stringent, assumption to be made.

(4.30) $c^2\Sigma_1 = \Sigma_2$ , $c > 1$

Granted condition (4.30), definitions (4.11) imply

(4.31) $c\,\|\delta_1 - \delta_2\|_1^* = \|\delta_1 - \delta_2\|_2^*$

and applying (4.31) to definitions (4.21) yields

(4.32) $k_2 = -k_1/c$

so that substitution is possible. Noting that $k_1 > 0$ so that
$P(y_1 < k_1) > 1/2$, let $c \to \infty$. Then $-k_1/c \to 0$ so that
$P(y_2 < -k_1/c) \to 1/2$. Applying these results to (4.24) gives

(4.33) $\alpha > \dfrac{1/2 - P(y_2 < -k_1/c)}{P(y_1 < k_1) - P(y_2 < -k_1/c)} \to 0$

so that $\alpha < 1/2$ is certainly possible when (4.24) obtains. Since
(4.24) implies (4.28), the candidate of the 1st party wins the
election.


An intuitive understanding of the above result can be obtained 
by recourse to a simple graph. Assume a single index of a single
issue so that n=1. In Figure 3 the densities $f_1(w)$ and $f_2(v)$ are
plotted and the means (the respective candidates’ platforms) are 
appropriately indicated. Note that the variance of the density of 
preferred points for the 1 st party is much smaller than the variance 
of the 2 nd party. Inspection of the diagram makes clear the fact 
that the 1 st party’s candidate will obtain the votes of almost all 
the members of his own party and will also receive the votes of 
some members of the 2 nd party. Thus the candidate of the 1 st party 
can win even though his party is a minority. 
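The one-issue situation described above can also be imitated by simulation. The following sketch draws voters from two normal populations and lets each vote for the nearer platform; the particular means, standard deviations, party fraction, sample size, and seed are illustrative assumptions, not values from the text.

```python
import random

def simulate_share(delta1, sigma1, delta2, sigma2, alpha, voters=100_000, seed=0):
    """Simulated fraction of the total vote won by the 1st party's candidate.

    Each voter belongs to the 1st party with probability alpha, has a normally
    distributed preferred point, and votes for the nearer of the two platforms
    (the party means delta1 and delta2).
    """
    rng = random.Random(seed)
    votes_for_first = 0
    for _ in range(voters):
        if rng.random() < alpha:
            x = rng.gauss(delta1, sigma1)   # member of the cohesive 1st party
        else:
            x = rng.gauss(delta2, sigma2)   # member of the dispersed 2nd party
        if abs(x - delta1) < abs(x - delta2):
            votes_for_first += 1
    return votes_for_first / voters

# A cohesive minority (30 percent of voters) against a dispersed majority.
share = simulate_share(delta1=0.0, sigma1=0.5, delta2=2.0, sigma2=3.0, alpha=0.3)
```

For these illustrative values, formula (4.23) gives a share of roughly 0.55, so the simulated minority candidate should finish above one half.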

It is interesting to speculate about the rise of the Nazi party in 
Germany in the light of this result. It is also interesting to consider 
Communist Party participation in the elections in certain countries 
in terms of this result. 


5. Platforms and the General Election 
The analysis of the above section suggests an interesting question. 
Suppose that one or both of the parties is something less than 
purely democratic in the selection of its candidate. Can the party 
improve its chances in the general election by carefully selecting 
a candidate whose personal platform is something of a “compromise”
between the desires of the members of the candidate’s own
party and those of the members of the other party? The answer
seems to be affirmative. Granted the existence of two populations 
(parties), this section is devoted to the demonstration of two 
propositions, both of which depend upon normality. First, it will 
be shown that there exists some convex combination of the two 
vectors of means which is at least as good as any other type of 
strategy. Second, it will be shown that there exists a particular 
convex combination which dominates all others. 

Let $f_1(w)$ and $f_2(v)$ be multivariate normal densities whose
means are given by (4.1) and variance-covariance matrices by
(4.2). Let (4.6) and (4.7) define the voting rule. Then for any
platforms $\theta_1$ and $\theta_2$ such that $\theta_1 \ne \theta_2$, it can be shown by repeating
steps (3.5-3.15) that voters from the respective populations choose
the 1st candidate if

(5.1) $y_1 = \dfrac{(w - \delta_1)'A(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|_1^*} > \dfrac{\|\theta_1 - \delta_1\|^2 - \|\theta_2 - \delta_1\|^2}{2\,\|\theta_1 - \theta_2\|_1^*} = t_1$
      $y_2 = \dfrac{(v - \delta_2)'A(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|_2^*} > \dfrac{\|\theta_1 - \delta_2\|^2 - \|\theta_2 - \delta_2\|^2}{2\,\|\theta_1 - \theta_2\|_2^*} = t_2$

where the subscripts on v and w are omitted for convenience.
Obviously, if one thinks of selecting a voter at random from each
of the two populations, $y_1$ and $y_2$ are distributed as standard normal
variables. Thus a sufficient condition for the 1st candidate to win
the election in the respective populations is

(5.2) $P(y_1 > t_1) > 1/2$
      $P(y_2 > t_2) > 1/2$

and this requires $t_1 < 0$ and $t_2 < 0$. From (5.1) it is clear that
(5.2) obtains only if

(5.3) $\|\theta_1 - \delta_1\| < \|\theta_2 - \delta_1\|$
      $\|\theta_1 - \delta_2\| < \|\theta_2 - \delta_2\|$

so that determining that (5.3) obtains is equivalent to finding that
the 1st candidate will win the election.

Consider the first proposition. Let the 1st candidate choose a
convex combination of the two vectors of means. Thus

(5.4) $\theta_1 = \beta_1\delta_1 + (1 - \beta_1)\delta_2$ , $0 \le \beta_1 \le 1$

represents a strategy which is to be shown to win or tie against any
platform $\theta_2$, chosen by the 2nd candidate, that is not a convex
combination of the two vectors of means. Note specifically
that since $\theta_2$ is not a convex combination of $\delta_1$ and $\delta_2$,
the strategies $\theta_2 = \delta_1$ and $\theta_2 = \delta_2$ are ruled out.

Suppose that the 2nd candidate chooses a platform $\theta_2$ such that

(5.5) $\|\theta_2 - \delta_1\| > \|\delta_1 - \delta_2\|$

so that the “distance” from $\theta_2$ to the mean $\delta_1$ of the 1st population
is greater than the distance between the two means. Let the 1st candidate
choose $\beta_1 = 0$ so that $\theta_1 = \delta_2$. Then

(5.6) $\|\theta_1 - \delta_1\| = \|\delta_1 - \delta_2\| < \|\theta_2 - \delta_1\|$

so that the 1st candidate wins in the 1st population. Similarly,

(5.7) $\|\theta_1 - \delta_2\| = \|\delta_2 - \delta_2\| = 0 < \|\theta_2 - \delta_2\|$

so that the 1st candidate also wins in the 2nd population.

Alternatively, suppose that the 2nd candidate chooses a platform
$\theta_2$ such that

(5.8) $\|\theta_2 - \delta_2\| > \|\delta_1 - \delta_2\|$

Then let the 1st candidate select $\beta_1 = 1$ so that $\theta_1 = \delta_1$. Thus

(5.9) $\|\theta_1 - \delta_1\| = \|\delta_1 - \delta_1\| = 0 < \|\theta_2 - \delta_1\|$

so that the 1st candidate wins in the 1st population and

(5.10) $\|\theta_1 - \delta_2\| = \|\delta_1 - \delta_2\| < \|\theta_2 - \delta_2\|$

so that the 1st candidate also wins in the 2nd population.

The above results mean that one need only consider the case
in which $\theta_2$ is such that both

(5.11) $\|\theta_2 - \delta_1\| \le \|\delta_1 - \delta_2\|$
       $\|\theta_2 - \delta_2\| \le \|\delta_1 - \delta_2\|$

obtain. Accordingly, presume that the 2nd candidate chooses a $\theta_2$
such that it is not a convex combination of $\delta_1$ and $\delta_2$ but does
satisfy (5.11). By manipulating (5.4) and taking norms of the
results one can obtain

(5.12) $\|\theta_1 - \delta_1\| = (1 - \beta_1)\,\|\delta_1 - \delta_2\|$
       $\|\theta_1 - \delta_2\| = \beta_1\,\|\delta_1 - \delta_2\|$

Suppose that the 1st candidate chooses

(5.13) $\beta_1 = \dfrac{\|\theta_2 - \delta_2\|}{\|\delta_1 - \delta_2\|}$

and note that $0 \le \beta_1 \le 1$ by (5.11). Substituting (5.13) into the
2nd of the equalities (5.12),

(5.14) $\|\theta_1 - \delta_2\| = \|\theta_2 - \delta_2\|$

so that the 1st and 2nd candidates tie in the 2nd population.
Substituting (5.13) into the 1st of the equalities (5.12),

(5.15) $\|\theta_1 - \delta_1\| = \|\delta_1 - \delta_2\| - \|\theta_2 - \delta_2\|$

and noting by the triangle inequality that

(5.16) $\|\delta_1 - \delta_2\| \le \|\theta_2 - \delta_1\| + \|\theta_2 - \delta_2\|$

so that by substituting (5.16) into (5.15)

(5.17) $\|\theta_1 - \delta_1\| \le \|\theta_2 - \delta_1\|$

and the 1st candidate at worst ties in the 1st population. In fact, it
can be shown that for any $\beta_1$ such that

(5.18) $\dfrac{\|\delta_1 - \delta_2\| - \|\theta_2 - \delta_1\|}{\|\delta_1 - \delta_2\|} \le \beta_1 \le \dfrac{\|\theta_2 - \delta_2\|}{\|\delta_1 - \delta_2\|}$

where $\theta_1$ is given by (5.4), the 1st candidate wins or at worst ties
in the general election. Note that, by the triangle inequality, there
is at least one $\beta_1$ in the interval. The following theorem is
established:

Theorem 5.1: Given the assumptions resulting in the voting rule (4.6)
and (4.7), given that the densities of the preferred positions of the
members of the two populations are normal, and given that one candidate
selects a platform which is not a convex combination of the two vectors
of means; then there exists a platform which is a convex combination of
the two vectors of means such that the candidate choosing the latter
platform will either win or tie in the general election.


The above theorem means that the strategy of selecting as a
platform a convex combination of the two vectors of means can be
at least as good as any other type of platform which can be
devised. Therefore, it can be argued that if both candidates are free
to choose whatever platform they desire, each should select a
convex combination of the two vectors of means.
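The construction behind Theorem 5.1 can be written out as a short routine. The sketch below is an illustration rather than the authors' procedure: it takes a rival platform, applies the three cases of the argument in turn, and returns a weight and the resulting convex combination of the two mean vectors. The norm is the square root of x'Ax, matrices are plain nested lists, and all names are hypothetical.

```python
import math

def a_norm(x, A):
    """Norm sqrt(x' A x) for a symmetric positive definite A (nested lists)."""
    n = len(x)
    return math.sqrt(sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n)))

def diff(x, y):
    """Componentwise difference x - y."""
    return [xi - yi for xi, yi in zip(x, y)]

def convex_reply(theta2, delta1, delta2, A):
    """Return (beta1, theta1), a convex combination that wins or ties against theta2."""
    span = a_norm(diff(delta1, delta2), A)   # distance between the two means
    r1 = a_norm(diff(theta2, delta1), A)     # rival's distance to the 1st mean
    r2 = a_norm(diff(theta2, delta2), A)     # rival's distance to the 2nd mean
    if r1 > span:
        beta1 = 0.0          # first case: stand on the 2nd population's mean
    elif r2 > span:
        beta1 = 1.0          # second case: stand on the 1st population's mean
    else:
        beta1 = r2 / span    # remaining case: tie in the 2nd population
    theta1 = [beta1 * p + (1.0 - beta1) * q for p, q in zip(delta1, delta2)]
    return beta1, theta1

A = [[2.0, 0.5], [0.5, 1.0]]
delta1, delta2 = [0.0, 0.0], [4.0, 0.0]
rivals = [[2.0, 1.0], [10.0, 3.0], [-0.5, 0.0], [0.5, 0.5]]
replies = [convex_reply(th, delta1, delta2, A) for th in rivals]
```

In each case the returned platform is no farther from either mean than the rival platform, which is the distance condition under which the candidate wins or ties in both populations.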

Suppose not only that $\theta_1$ is given by (5.4) but also that the 2nd
candidate selects a platform

(5.19) $\theta_2 = \beta_2\delta_1 + (1 - \beta_2)\delta_2$ , $0 \le \beta_2 \le 1$

so that attention is now centered on the instance in which both 
candidates have these convex combinations as their platforms. It 
is to be shown that there exists a particular convex combination 
which wins over all other convex combinations. 

In a manner similar to that in which (5.12) was obtained, one
may manipulate (5.19) and express the results in terms of norms
to get

(5.20) $\|\theta_2 - \delta_1\| = (1 - \beta_2)\,\|\delta_1 - \delta_2\|$
       $\|\theta_2 - \delta_2\| = \beta_2\,\|\delta_1 - \delta_2\|$

By substituting (5.20) and (5.12) into the conditions (5.3) for
the 1st candidate to win in the respective populations, it is clear
that $\beta_1 > \beta_2$ is required in the 1st population and $\beta_1 < \beta_2$ is
required in the 2nd population. Therefore, if the two candidates
choose the respective platforms (5.4) and (5.19), the fact that
the 1st candidate wins in one population implies that the 2nd
candidate wins in the other population.

Recall that the fraction of the total vote going to the 1st candidate
is

(5.21) $R = \alpha P(y_1 > t_1) + (1 - \alpha)P(y_2 > t_2)$

By noting that $y_1$ and $y_2$ are both distributed as unit normal
variates, it follows that

(5.22) $P(y_1 > t_1) = P(y_1 < -t_1)$
       $P(y_2 > t_2) = P(y_2 < -t_2)$

so that

(5.23) $R = \alpha P(y_1 < -t_1) + (1 - \alpha)P(y_2 < -t_2)$

is equivalent to (5.21).
It is desirable to get the expressions for $t_1$ and $t_2$ into forms more
suitable for analysis. By the definitions (5.4) of $\theta_1$ and (5.19) of
$\theta_2$, it is easily seen that

(5.24) $\theta_1 - \theta_2 = (\beta_1 - \beta_2)\delta_1 + (\beta_2 - \beta_1)\delta_2 = (\beta_1 - \beta_2)(\delta_1 - \delta_2)$

so that by taking the appropriate norm

(5.25) $\|\theta_1 - \theta_2\|_r^* = |\beta_1 - \beta_2|\,\|\delta_1 - \delta_2\|_r^*$ , r=1,2

where

(5.26) $\|\delta_1 - \delta_2\|_r^* = \sqrt{(\delta_1 - \delta_2)'A\Sigma_r A(\delta_1 - \delta_2)}$ , r=1,2.

By recalling the definition (5.1) of $t_1$, substituting from the 1st of
the equivalences of (5.12) and (5.20) for the appropriate terms
in the numerator, and substituting (5.25) for the denominator, the expressions

(5.27) $t_1 = \dfrac{[(1 - \beta_1)^2 - (1 - \beta_2)^2]\,\|\delta_1 - \delta_2\|^2}{2\,|\beta_1 - \beta_2|\,\|\delta_1 - \delta_2\|_1^*} = -\dfrac{(\beta_1 - \beta_2)(2 - \beta_1 - \beta_2)\,\|\delta_1 - \delta_2\|^2}{2\,|\beta_1 - \beta_2|\,\|\delta_1 - \delta_2\|_1^*}$

are obtained easily after appropriate manipulation.

It is now necessary to make an assumption concerning the relative
magnitudes of $\beta_1$ and $\beta_2$. There are two cases to be considered.
First, presume that $\beta_1 > \beta_2$ so that $(\beta_1 - \beta_2) = |\beta_1 - \beta_2|$. Thus it
follows from the last expression of (5.27) that

(5.28) $-t_1 = (1 - \lambda)s_1$ , $\beta_1 > \beta_2$

where

(5.29) $\lambda = \dfrac{\beta_1 + \beta_2}{2}$
       $s_1 = \dfrac{\|\delta_1 - \delta_2\|^2}{\|\delta_1 - \delta_2\|_1^*}$

By repeating these steps with respect to $t_2$, it is easily seen that

(5.30) $-t_2 = -\lambda s_2$ , $\beta_1 > \beta_2$

where

(5.31) $s_2 = \dfrac{\|\delta_1 - \delta_2\|^2}{\|\delta_1 - \delta_2\|_2^*}$

and note that both $s_1$ and $s_2$ are positive constants while $\lambda$ is a
variable. Expressions (5.28) and (5.30) are useful for analysis.

It is now necessary to consider the instance in which $\beta_1 < \beta_2$ so
that $(\beta_1 - \beta_2) = -|\beta_1 - \beta_2|$. By noting the last of the
expressions in (5.27), it is easily seen that

(5.32) $-t_1 = -(1 - \lambda)s_1$ , $\beta_1 < \beta_2$

and it also follows that

(5.33) $-t_2 = \lambda s_2$ , $\beta_1 < \beta_2$

and expressions (5.32) and (5.33) are useful for analysis.
Define $R_1(\lambda)$ to be the fraction of the total vote going to the 1st
candidate when $\beta_1 > \beta_2$. Then from (5.30), (5.28), and (5.23)

(5.34) $R_1(\lambda) = \alpha P[y_1 < (1 - \lambda)s_1] + (1 - \alpha)P[y_2 < -\lambda s_2]$

and recalling that $y_1$ and $y_2$ are unit normal variates

(5.35) $\dfrac{dR_1(\lambda)}{d\lambda} = -\dfrac{\alpha s_1}{\sqrt{2\pi}}\exp\left[-\dfrac{(1 - \lambda)^2 s_1^2}{2}\right] - \dfrac{(1 - \alpha)s_2}{\sqrt{2\pi}}\exp\left[-\dfrac{\lambda^2 s_2^2}{2}\right] < 0$

so it follows that $R_1(\lambda)$ is a monotonically decreasing function of
$\lambda$ in the interval $0 \le \lambda \le 1$.

Define $R_2(\lambda)$ to be the fraction of the total vote going to the 1st
candidate when $\beta_1 < \beta_2$. Then from (5.33), (5.32) and (5.23)

(5.36) $R_2(\lambda) = \alpha P[y_1 < -(1 - \lambda)s_1] + (1 - \alpha)P[y_2 < \lambda s_2]$

but by noting that

(5.37) $P[y_1 < -(1 - \lambda)s_1] = 1 - P[y_1 < (1 - \lambda)s_1]$
       $P[y_2 < \lambda s_2] = 1 - P[y_2 < -\lambda s_2]$

(5.36) can be written

(5.38) $R_2(\lambda) = 1 - R_1(\lambda)$

so that $R_2(\lambda)$ is a monotonically increasing function of $\lambda$ in the
interval $0 \le \lambda \le 1$.

From the monotonic properties of $R_1(\lambda)$ and $R_2(\lambda)$, and from
expression (5.38), it follows that there must exist a value $\lambda^*$ of $\lambda$
such that

(5.39) $R_1(\lambda^*) = R_2(\lambda^*) = 1/2$

Now suppose that the 1st candidate chooses $\beta_1 = \lambda^*$. Then there
are two cases to be examined.

Suppose first that $\lambda^* = \beta_1 > \beta_2$. Thus

(5.40) $\lambda = (\beta_1 + \beta_2)/2 < \beta_1 = \lambda^*$

and as $\beta_1 > \beta_2$, $R_1(\lambda)$ must be examined. Since from (5.40)
$\lambda < \lambda^*$, and due to the fact that $R_1(\lambda)$ is a monotonically
decreasing function of $\lambda$,

(5.41) $R_1(\lambda) > R_1(\lambda^*) = 1/2$

so that the 1st candidate wins the election.

Suppose next that $\lambda^* = \beta_1 < \beta_2$. Then

(5.42) $\lambda = (\beta_1 + \beta_2)/2 > \beta_1 = \lambda^*$

and as $\beta_1 < \beta_2$, $R_2(\lambda)$ must be examined. Since $\lambda > \lambda^*$ from
(5.42), and due to the fact that $R_2(\lambda)$ is a monotonically
increasing function of $\lambda$,

(5.43) $R_2(\lambda) > R_2(\lambda^*) = 1/2$

so that the 1st candidate wins the election.
Relations (5.41) and (5.43) show that the particular convex
combination

(5.44) $\theta^* = \lambda^*\delta_1 + (1 - \lambda^*)\delta_2$

wins over all other convex combinations. Of course, if both candidates
select (5.44), then a tie is expected in the election. The
following theorem is established:

Theorem 5.2: Given the assumptions resulting in the voting rule (4.6)
and (4.7), given that the densities of the preferred positions of the
members of the two populations are normal, and given that the two
candidates select their platforms from the class (5.4) and (5.19), then
the platform (5.44) is a dominant strategy.
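Although the proof does not give a numerical value for the dominant weight, it can be approximated for given constants. The sketch below is one possible approach, not the authors' method: because the vote share in (5.34) is decreasing in its argument, bisection on the unit interval locates the point where it equals one half. The party fraction and the two positive constants are illustrative inputs, and the function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def r1(lam, alpha, s1, s2):
    """Vote share (5.34) of the 1st candidate when beta1 > beta2."""
    return alpha * normal_cdf((1.0 - lam) * s1) + (1.0 - alpha) * normal_cdf(-lam * s2)

def solve_lambda_star(alpha, s1, s2, tol=1e-12):
    """Bisection for the weight at which r1 equals one half (needs 0 < alpha < 1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r1(mid, alpha, s1, s2) > 0.5:
            lo = mid    # r1 is decreasing, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def share_of_first(beta1, beta2, alpha, s1, s2):
    """Vote share of the 1st candidate when both platforms are convex combinations."""
    if beta1 == beta2:
        return 0.5
    value = r1(0.5 * (beta1 + beta2), alpha, s1, s2)
    return value if beta1 > beta2 else 1.0 - value

alpha, s1, s2 = 0.4, 2.0, 1.0
lam_star = solve_lambda_star(alpha, s1, s2)
worst = min(share_of_first(lam_star, k / 100.0, alpha, s1, s2) for k in range(101))
```

Here `worst` is the smallest share the 1st candidate receives against a grid of rival weights; under the theorem it should not fall below one half.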

The above theorems have an interesting implication for the
process by which parties select candidates. From Section 3 it is clear
that the vector of means of the preferred positions of the party
membership exerts a powerful influence upon the platform of a
candidate emerging from a truly representative, democratic
process. It can be argued that in the Western World there are powerful
forces causing the parties to become “more democratic” in
regard to nominations. Yet, both of the above theorems indicate
that a party can improve its chances of winning by giving
consideration to the preferences of the members of the other party.
The “dilemma of nominations” is that if the party membership is
not able to take a strategic point of view and if “political bosses”
are able to take such a point of view, then having the “smoke-filled
cloakroom nominations” may improve the party’s chances in the
election.

There are two unfortunate points to be made. First, the fact
that the above theorems are separate implies that it is unknown
whether (5.44) is a dominant strategy overall. Second, the proof
of Theorem 5.2 is not constructive in the sense that the numerical
value of $\lambda^*$ is unknown. Therefore, it may be useful to present a
simple example.

Suppose that $\alpha = 1/2$. Then it is easy to verify from (5.39) that

(5.45) $P[y_1 < (1 - \lambda^*)s_1] + P[y_2 < -\lambda^* s_2] = 1$

or, by noting the last of the relationships (5.37)

(5.46) $P[y_1 < (1 - \lambda^*)s_1] = P[y_2 < \lambda^* s_2]$

so that by defining

(5.47) $\lambda^* = \dfrac{s_1}{s_1 + s_2}$

relationship (5.46) is satisfied. If in addition $\Sigma_1 = \Sigma_2$ so that $s_1 = s_2$,
then $\lambda^* = 1/2$ and the dominant strategy

(5.48) $\theta^* = (\delta_1 + \delta_2)/2$

is a simple average of the two vectors of means.
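The closed form for equally sized parties is easy to check numerically. The following sketch evaluates the vote share of (5.34) at the weight given by (5.47); the constants used are illustrative values and the function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def r1(lam, alpha, s1, s2):
    """Vote share (5.34) of the 1st candidate when beta1 > beta2."""
    return alpha * normal_cdf((1.0 - lam) * s1) + (1.0 - alpha) * normal_cdf(-lam * s2)

def lambda_star_equal_parties(s1, s2):
    """Dominant weight (5.47), valid when the party fraction is one half."""
    return s1 / (s1 + s2)

# Unequal dispersion: the weight tilts toward the 1st population's mean.
lam_unequal = lambda_star_equal_parties(2.0, 1.0)
share_unequal = r1(lam_unequal, 0.5, 2.0, 1.0)

# Equal dispersion: the weight is one half, the simple average of the means.
lam_equal = lambda_star_equal_parties(1.5, 1.5)
```

At this weight the two arguments (1 - lam) * s1 and lam * s2 coincide, so the two normal probabilities sum to one and the share is exactly one half.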

6. A Simple Extension

All of the previous analysis presumes a certain homogeneity of 
the taste of voters in the sense that the matrix A enters all loss 
functions. While this assumption is convenient, it does not allow 
for the simple situation in which some voters do not care about 
some subset of issues. Accordingly, one situation of this type is 
considered here. 

Again let the totality of voters be divided into two mutually
exclusive and exhaustive populations. Let $w_i$ and $v_j$ be n component
vectors representing the preferred positions of the i-th voter
in the 1st population and the j-th voter in the 2nd population
respectively. Then $f_1(w)$ and $f_2(v)$ represent the densities of the
preferred positions. The mean vectors are given by (4.1) and the
variance-covariance matrices by (4.2).

Instead of using the matrix A in all loss functions, let

(6.1) $L_{1i}(\theta) = (w_i - \theta)'A_1(w_i - \theta)$
      $L_{2j}(\theta) = (v_j - \theta)'A_2(v_j - \theta)$

represent the respective loss functions of the i-th and j-th voters from
the 1st and 2nd populations. Suppose that both $A_1$ and $A_2$ are singular
$n \times n$ matrices and are given by

(6.2) $A_1 = \begin{bmatrix} M & 0 \\ 0 & 0 \end{bmatrix}$ ; $A_2 = \begin{bmatrix} 0 & 0 \\ 0 & N \end{bmatrix}$

where M is an $m \times m$ positive definite matrix (m < n) and N is a
$(n - m) \times (n - m)$ positive definite matrix. Note that the specification
of $A_1$ means that all voters in the 1st population obtain possible
utility losses only from the first m components of political
choice. Therefore, these voters do not care about the last (n - m)
components of choice. Similarly, the specification of $A_2$ implies
that all voters in the 2nd population obtain possible utility losses
only from the last (n - m) components of political choice and do
not care about the first m components. Note that $A_1 A_2 = 0$. One
might say that there is no interaction between the desires of the
two populations.
It might be observed that the specification (6.2) of $A_1$ and $A_2$
raises a question concerning the legitimacy of calling $w_i$ and $v_j$ the
preferred positions of the i-th and j-th voters. The following
developments should make clear the fact that the difficulty is entirely
terminological.

Presuming the existence of two candidates with the platforms
$\theta_1$ and $\theta_2$ where $\theta_1 \ne \theta_2$, it can be shown by postulating the usual
voting rules and repeating steps (3.5-3.15) that voters from the
respective populations choose the 1st candidate if

(6.3) $y_1 = \dfrac{(w - \delta_1)'A_1(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|_1^*} > \dfrac{\|\theta_1 - \delta_1\|_1^2 - \|\theta_2 - \delta_1\|_1^2}{2\,\|\theta_1 - \theta_2\|_1^*} = t_1$
      $y_2 = \dfrac{(v - \delta_2)'A_2(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|_2^*} > \dfrac{\|\theta_1 - \delta_2\|_2^2 - \|\theta_2 - \delta_2\|_2^2}{2\,\|\theta_1 - \theta_2\|_2^*} = t_2$

and note the subscripts on the norms in the numerators of the
terms on the right of the inequalities. These norms must be
examined in some detail.

By definition

(6.4) $\|\theta_1 - \delta_1\|_1^2 = (\theta_1 - \delta_1)'A_1(\theta_1 - \delta_1)$
      $\|\theta_1 - \delta_2\|_2^2 = (\theta_1 - \delta_2)'A_2(\theta_1 - \delta_2)$

but the specification (6.2) of $A_1$ and $A_2$ indicates that (6.4) can
be expressed in a more useful manner. Let $(\theta_1 - \delta_1)_m$ represent
a vector composed of the first m components of $(\theta_1 - \delta_1)$.
Similarly, let $(\theta_1 - \delta_2)_r$ represent a vector composed of the last
r=n-m components of $(\theta_1 - \delta_2)$. Then (6.2) implies

(6.5) $(\theta_1 - \delta_1)'A_1(\theta_1 - \delta_1) = (\theta_1 - \delta_1)_m' M (\theta_1 - \delta_1)_m$
      $(\theta_1 - \delta_2)'A_2(\theta_1 - \delta_2) = (\theta_1 - \delta_2)_r' N (\theta_1 - \delta_2)_r$

so that only the first m components of $(\theta_1 - \delta_1)$ are involved in
the norm $\|\theta_1 - \delta_1\|_1$ and only the last r=n-m components of
$(\theta_1 - \delta_2)$ are involved in the norm $\|\theta_1 - \delta_2\|_2$. By similar
definitions and using the same argument

(6.6) $\|\theta_2 - \delta_1\|_1^2 = (\theta_2 - \delta_1)_m' M (\theta_2 - \delta_1)_m$
      $\|\theta_2 - \delta_2\|_2^2 = (\theta_2 - \delta_2)_r' N (\theta_2 - \delta_2)_r$

so that respectively the first m and last r components are involved
in these norms also.

Essentially the same phenomenon is observed in the norms in
the denominators of the terms in (6.3). Let $(\theta_1 - \theta_2)_m$ represent
a vector composed of the first m components of $(\theta_1 - \theta_2)$.
Similarly, let $(\theta_1 - \theta_2)_r$ represent a vector composed of the last
r=n-m components of $(\theta_1 - \theta_2)$. Note that

(6.7) $A_1\Sigma_1 A_1 = \begin{bmatrix} M\Sigma_{11}M & 0 \\ 0 & 0 \end{bmatrix}$ ; $A_2\Sigma_2 A_2 = \begin{bmatrix} 0 & 0 \\ 0 & N\Sigma_{22}N \end{bmatrix}$






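where $\Sigma_{11}$ is the $m \times m$ submatrix made up of the first m rows and
m columns of $\Sigma_1$ and $\Sigma_{22}$ is the $r \times r$ submatrix made up of the last
r rows and r columns of $\Sigma_2$. Thus

(6.8) $\|\theta_1 - \theta_2\|_1^* = \sqrt{(\theta_1 - \theta_2)_m' M\Sigma_{11}M (\theta_1 - \theta_2)_m}$
      $\|\theta_1 - \theta_2\|_2^* = \sqrt{(\theta_1 - \theta_2)_r' N\Sigma_{22}N (\theta_1 - \theta_2)_r}$

follows by definition.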

By combining (6.4) and (6.5) and substituting the result into
(6.3), and by substituting (6.6) and (6.8) into (6.3), it is easy to
see the following fact. For all voters in the 1st population the
choice of a candidate depends only upon the first $m$ components
of the vectors $w$, $\theta_1$, and $\theta_2$. Similarly, for all voters in the 2nd
population the choice of a candidate depends only upon the last
$r = n - m$ components of the vectors $v$, $\delta_2$, $\theta_1$, and $\theta_2$. It also
follows that the last $r$ components of the vectors $w$ and the first $m$
components of the vectors $v$ can be arbitrarily specified without
affecting the analysis.
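
This invariance can be illustrated with a short sketch. It assumes the voting rule (6.3) for the 1st population with the common positive denominator cancelled from both sides, and it assumes that $A_1$ has the block form implied by (6.2); the numbers and the function names `bilinear`, `sub`, and `prefers_first` are hypothetical.

```python
def bilinear(x, a, y):
    """Return x' A y for vectors x, y and a square matrix a (nested lists)."""
    n = len(x)
    return sum(x[i] * a[i][j] * y[j] for i in range(n) for j in range(n))


def sub(x, y):
    """Componentwise difference x - y."""
    return [xi - yi for xi, yi in zip(x, y)]


def prefers_first(w, delta, a, theta1, theta2):
    """Rule (6.3) with the common positive denominator cancelled:
    (w - delta)' A (theta1 - theta2) > (|theta1 - delta|^2 - |theta2 - delta|^2) / 2,
    where |x|^2 = x' A x."""
    lhs = bilinear(sub(w, delta), a, sub(theta1, theta2))
    d1, d2 = sub(theta1, delta), sub(theta2, delta)
    rhs = (bilinear(d1, a, d1) - bilinear(d2, a, d2)) / 2.0
    return lhs > rhs


# Hypothetical values: n = 3, m = 2, so A1 weights only the first two issues.
A1 = [[2.0, 0.5, 0.0],
      [0.5, 1.0, 0.0],
      [0.0, 0.0, 0.0]]
delta1 = [0.0, 0.0, 0.0]
theta1 = [1.0, 1.0, 5.0]
theta2 = [-1.0, 0.0, -5.0]

w = [0.8, 0.9, 0.0]
w_altered = [0.8, 0.9, 100.0]  # same first m components, different last one

choice = prefers_first(w, delta1, A1, theta1, theta2)
choice_altered = prefers_first(w_altered, delta1, A1, theta1, theta2)
```

Both calls return the same choice, since the altered component of `w` is multiplied only by zero entries of `A1`. Cancelling the denominator is legitimate only when the two platforms differ in at least one of the first $m$ components, so that the norm in the denominator is positive.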

Suppose that $\theta_1$ is a dominant strategy in the 1st population and
$\theta_2$ is a dominant platform in the 2nd population. (If $f_1(w)$ and $f_2(v)$
are multivariate normals, then $\theta_1 = \delta_1$ and $\theta_2 = \delta_2$.) Define a new
vector $\theta$ which is composed of the first $m$ components of $\theta_1$ and
the last $r = n - m$ components of $\theta_2$. Then the voters of the 1st
population, for whom only the first $m$ components are relevant,
view $\theta$ as identically the same as $\theta_1$. Similarly, the voters in the
2nd population, for whom only the last $r$ components are relevant,
view $\theta$ as identically the same as $\theta_2$. It follows that $\theta$ is dominant
in both the 1st and 2nd populations and, therefore, wins over any
other strategy in the general election. The following theorem is
established:

Theorem 6.1: Given the assumptions resulting in the voting rules (6.3)
and the specification (6.2) of the matrices $A_1$ and $A_2$, then if $\theta_1$ is a
dominant platform for the 1st population and $\theta_2$ is a dominant strategy
for the 2nd population, the vector $\theta$, which is composed of the first $m$
components of $\theta_1$ and the last $r = n - m$ components of $\theta_2$, is a dominant
strategy for the general election.
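
The construction of the composite platform can be sketched as follows. The sketch assumes a quadratic loss of the form $(x - \theta)' A (x - \theta)$ for each voter, together with the block forms of $A_1$ and $A_2$ implied by (6.2); the matrices, platforms, voters, and the names `loss` and `composite` are hypothetical and serve only as an illustration, not as a proof of dominance.

```python
def loss(x, theta, a):
    """Quadratic loss (x - theta)' A (x - theta) for nested-list matrix a."""
    d = [xi - ti for xi, ti in zip(x, theta)]
    n = len(d)
    return sum(d[i] * a[i][j] * d[j] for i in range(n) for j in range(n))


def composite(theta1, theta2, m):
    """First m components of theta1 followed by the remaining ones of theta2."""
    return theta1[:m] + theta2[m:]


m = 2
# Assumed block forms: A1 weights issues 1..m, A2 weights issues m+1..n.
A1 = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 0.0]]
A2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 3.0]]

theta1 = [1.0, 2.0, 9.0]   # taken to be dominant in the 1st population
theta2 = [7.0, 8.0, -1.0]  # taken to be dominant in the 2nd population
theta = composite(theta1, theta2, m)

w = [0.3, -0.2, 4.0]  # a voter from the 1st population
v = [5.0, 5.0, 0.5]   # a voter from the 2nd population

same_for_w = abs(loss(w, theta, A1) - loss(w, theta1, A1)) < 1e-12
same_for_v = abs(loss(v, theta, A2) - loss(v, theta2, A2)) < 1e-12
```

Each voter's loss under the composite platform equals the loss under the platform that is dominant in that voter's own population, which is the sense in which the two groups "view" the composite as identical to their preferred platform.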

This theorem has a rather intuitive interpretation. Given that
one of the mutually exclusive and exhaustive groups “desires” one
set of policies, that the other group “desires” another set of policies,
and that there is no conflict between the two sets of policies since
each refers to a mutually exclusive set of issues, then the politician
can enhance his chance of winning the election by giving each
group just what it desires.

7. Concluding Comments 

No attempt is made here to summarize the results which were
derived in this analysis. Yet, some of the remarks of the introduction
merit repetition. There is no claim that this simple model of
policy formation captures the anomalies of the modern, complex
political phenomenon. Simplifying assumptions were made to
reduce the problem to a manageable size so that certain propositions
could be established. It is hoped that these propositions (as
well as the analysis itself) produce insights into the complexities
of policy formation in a democratic society.

One additional remark is warranted. It is clear that certain
complications, such as multi-party competition under various
conditions, can be introduced and analyzed within the broad framework
developed here. These additional complications, as well as
the relaxing of certain of the assumptions, must await the results
of future efforts.

Figure 1

Figure 2

Figure 3





